Abstract

Dataflow languages provide natural support for specifying con- 
straints between objects in dynamic applications, where programs 
need to react efficiently to changes of their environment. Re- 
searchers have long investigated how to take advantage of dataflow 
constraints by embedding them into procedural languages. Previ- 
ous mixed imperative/dataflow systems, however, require syntactic 
extensions or libraries of ad hoc data types for binding the imperative
program to the dataflow solver. In this paper we propose a
novel approach that smoothly combines the two paradigms without
placing undue burden on the programmer.

In our framework, programmers can define ordinary commands 
of the host imperative language that enforce constraints between 
objects stored in special memory locations designated as "reac- 
tive". Differently from previous approaches, reactive objects can 
be of any legal type in the host language, including primitive data 
types, pointers, arrays, and structures. Commands defining con- 
straints are automatically re-executed every time their input mem- 
ory locations change, letting a program behave like a spreadsheet 
where the values of some variables depend upon the values of other 
variables. The constraint solving mechanism is handled transpar- 
ently by altering the semantics of elementary operations of the host 
language for reading and modifying objects. We provide a formal 
semantics and describe a concrete embodiment of our technique 
into C/C++, showing how to implement it efficiently on conventional
platforms using off-the-shelf compilers. We discuss common
coding idioms and relevant applications to reactive scenarios, in- 
cluding incremental computation, observer design pattern, and data 
structure repair. The performance of our implementation is compared
to ad hoc problem-specific change propagation algorithms, as
well as to language-centric approaches such as self-adjusting com- 
putation and subject/observer communication mechanisms, show- 
ing that the proposed approach is efficient in practice. 

Categories and Subject Descriptors D.3.3 [Programming 
Languages]: Language Constructs and Features — Constraints 



General Terms Algorithms, design, experimentation, languages. 

Keywords Reactive programming, dataflow programming, im- 
perative programming, constraint solving, incremental computa- 
tion, observer design pattern, data structure repair. 



1. Introduction 

A one-way, dataflow constraint is an equation of the form 
$y = f(x_1, \ldots, x_n)$ in which the formula on the right side
is automatically re-evaluated and assigned to the variable
$y$ whenever any variable $x_i$ changes. If $y$ is modified from
outside the constraint, the equation is left temporarily unsat- 
isfied, hence the attribute "one-way". Dataflow constraints 
are recognized as a powerful programming methodology in 
a variety of contexts because of their versatility and simplicity [38].
The most widespread application of dataflow
constraints is perhaps embodied by spreadsheets [28]. In
a spreadsheet, the user can specify a cell formula that de- 
pends on other cells: when any of those cells is updated, the 
value of the first cell is automatically recalculated. Rules in 
a makefile are another example of dataflow constraints: a 
rule sets up a dependency between a target file and a list of 
input files, and provides shell commands for rebuilding the 
target from the input files. When the makefile is run, if any 
input file in a rule is discovered to be newer than the tar- 
get, then the target is rebuilt. The dataflow principle can be 
also applied to software development and execution, where 
the role of a cell/file is replaced by a program variable. This 
approach has been widely explored in the context of interactive
applications, multimedia animation, and real-time systems
[13, 24, 34, 39].

Since the values of program variables are automatically 
recalculated upon changes of other values, the dataflow com- 
putational model is very different from the standard imper- 
ative model, in which the memory store is changed explic- 
itly by the program via memory assignments. The execution 
flow of applications running on top of a dataflow environ- 
ment is indeed data-driven, rather than control-driven, pro- 
viding a natural ground for automatic change propagation in 
all scenarios where programs need to react to modifications 
of their environment. Implementations of the dataflow principle
share some common issues with self-adjusting computation,
in which programs respond to input changes by automatically
updating their output [25].

Unlike purely declarative constraints, dataflow constraints
are expressed by means of (imperative) methods whose
execution makes a relation satisfied. This programming style
is intuitive and readily accessible to a broad range of
developers [38], since the ability to smoothly combine different
paradigms in a unified framework makes it possible to take
advantage of different programming styles in the context of
the same application. The problem of integrating imperative
and dataflow programming has already been the focus of
previous work in the context of specific application domains
[11, 32-34, 38]. Previous mixed imperative/dataflow systems
are based on libraries of ad hoc data types and functions for
representing constraint variables and for binding the
imperative program to the constraint solver. One drawback of
these approaches is that constraint variables can only be of
special data types provided by the runtime library, causing
loss of flexibility and placing undue burden on the
programmer. A natural question is whether the dataflow model
can be made to work with general-purpose, imperative
languages, such as C, without adding syntactic extensions and
ad hoc data types. In this paper we answer this question
affirmatively.

Our Contributions. We present a general-purpose frame- 
work where programmers can specify generic one-way con- 
straints between objects of arbitrary types stored in reactive 
memory locations. Constraints are written as ordinary com- 
mands of the host imperative language and can be added 
and removed dynamically at run time. Since they can change 
multiple objects within the same execution, they are multi- 
output. The main feature of a constraint is its sensitivity to 
modifications of reactive objects: a constraint is automati- 
cally re-evaluated whenever any of the reactive locations it 
depends on is changed, either by the imperative program, or 
by another constraint. A distinguishing feature of our ap- 
proach is that the whole constraint solving mechanism is 
handled transparently by altering the semantics of elemen- 
tary operations of the host imperative language for reading 
and modifying objects. No syntax extensions are required 
and no new primitives are needed except for adding/remov- 
ing constraints, allocating/deallocating reactive memory lo- 
cations, and controlling the granularity of solver activations. 
Differently from previous approaches, programmers are not 
forced to use any special data types provided by the language 
extension, and can resort to the full range of conventional 
constructs for accessing and manipulating objects offered by 
the host language. In addition, our framework supports all 
the other features that have been recognized to be important 
in the design of dataflow constraint systems [38], including:

Arbitrary code: constraints consist of arbitrary code that is 
legal in the underlying imperative language, thus includ- 
ing loops, conditionals, function calls, and recursion. 



Address dereferencing: constraints are able to reference 
variables indirectly via pointers. 

Automatic dependency detection: constraints automatically 
detect the reactive memory locations they depend on dur- 
ing their evaluation, so there is no need for program- 
mers to explicitly declare dependencies, which are also 
allowed to vary over time. 

We embodied these principles into an extension of C/C++
that we called DC. Our extension has exactly the same syntax
as C/C++, but a different semantics. Our main contributions
are reflected in the organization of the paper and can
be summarized as follows: 

• In Section 2 we abstract our mechanism showing how
to extend an elementary imperative language to support 
one-way dataflow constraints using reactive memory. We 
distinguish between three main execution modes: nor- 
mal, constraint, and scheduling. We formally describe 
our mixed imperative/dataflow computational model by 
defining the interactions between these modes and pro- 
viding a formal semantics of our mechanism. 

• In Section 3 we discuss convergence of the dataflow constraint
solver by modeling the computation as an iterative
process that aims at finding a common fixpoint for the 
current set of constraints. We identify general constraint 
properties that let the solver terminate and converge to a 
common fixpoint independently of the scheduling strat- 
egy. This provides a sound unifying framework for solv- 
ing both acyclic and cyclic constraint systems. 

• In Section 4 we describe the concrete embodiment of our
technique into C/C++, introducing the main features of
DC. DC has exactly the same syntax as C/C++, but operations
that read or modify objects have a different semantics.
All other primitives, including creating and deleting
constraints and allocating and deallocating reactive mem- 
ory blocks, are provided as runtime library functions. 

• In Section 5 we give a variety of elementary and advanced
programming examples and discuss how DC can
improve C/C++ programmability in three relevant application
scenarios: incremental computation, implementation
of the observer software design pattern, and data
structure checking and repair. To the best of our knowledge,
these applications have not been explored before in
the context of dataflow programming.

• In Section 6 we describe how DC can be implemented
using off-the-shelf compilers on conventional platforms 
via a combination of runtime libraries, hardware/operat- 
ing system support, and dynamic code patching, without 
requiring any source code preprocessing. 

• In Section 7 we perform an extensive experimental analysis
of DC in a variety of settings, showing that our
implementation is effective in practice. We consider both
interactive applications and computationally demanding
benchmarks that manipulate lists, grids, trees, matrices,
and graphs. We assess the performance of DC against
conventional C-based implementations as well as against
competitors that can quickly react to input changes, i.e.,
ad hoc dynamic algorithms, incremental solutions realized
in CEAL [25] (a state-of-the-art C-based framework
for self-adjusting computation), and Qt's signal-slot
implementation of the subject/observer communication
mechanism [22].




Related work is discussed in Section 8 and directions for
future research are sketched in Section 9.

2. Abstract Model 

To describe our approach, we consider an elementary im- 
perative language and we show how to extend it to support 
one-way dataflow constraints. We start from While [35], an
extremely simple language of commands including a sub- 
language of expressions. Although While does not sup- 
port many fundamental features of concrete imperative lan- 
guages (including declarations, procedures, dynamic mem- 
ory allocation, type checking, etc.), it provides all the build- 
ing blocks for a formal description of our mechanism, ab- 
stracting away details irrelevant for our purposes. We dis- 
cuss how to modify the semantics of While to integrate a 
dataflow constraint solver. We call the extended language 
DWhile. DWhile is identical to While except for a different
semantics and additional primitives for adding/deleting
constraints dynamically and for controlling the granularity
of solver activations. As we will see in Section 4, these
primitives can be supported in procedural languages as runtime
library functions.

2.1 The DWhile Language 

The abstract syntax of DWhile is shown in Figure 1. The
language distinguishes between commands and expressions.
We use $c$, $c_1$, $c_2$ as meta-variables ranging over the set
of commands $Comm$, and $e$, $e_1$, $e_2$ as meta-variables
ranging over the set of expressions $Exp$. Canonical forms of
expressions are either storage locations $\ell \in Loc$, or
storable values $v$ over some arbitrary domain $Val$.
Expressions can also be obtained by applying to sub-expressions
any primitive operations defined over domain $Val$ (e.g., plus,
minus, etc.). Commands include:



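$$
\begin{array}{rcl}
e \in Exp & ::= & \ell \mid v \mid (e) \mid \ldots \\[4pt]
c \in Comm & ::= & \texttt{skip} \\
 & \mid & \ell := e \\
 & \mid & c_1 \,;\, c_2 \\
 & \mid & \texttt{if } e \texttt{ then } c_1 \texttt{ else } c_2 \\
 & \mid & \texttt{while } e \texttt{ do } c \\
 & \mid & \texttt{newcons } c \\
 & \mid & \texttt{delcons } c \\
 & \mid & \texttt{begin\_at } c \texttt{ end\_at}
\end{array}
$$

Figure 1. Abstract syntax of DWhile.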



Figure 2. Transitions between different execution modes. 



• Assignments of values to storage locations ($\ell := e$). These
commands are the basic state transformers. 

• Constructs for sequencing, conditional execution, and it- 
eration, with the usual meaning. 

• Two new primitives, newcons and delcons, for adding 
and deleting constraints dynamically. Notice that a con- 
straint in DWhile is just an ordinary command. 

• An atomic block construct, begin_at c end_at, that 
executes a command c atomically so that any constraint 
evaluation is deferred until the end of the block. This 
offers fine-grained control over solver activations. 

In Section 4 we will show a direct application of the concepts
developed in this section to the C/C++ programming
languages. 

2.2 Memory Model and Execution Modes 

Our approach hinges upon two key notions: reactive memory 
locations and constraints. Reactive memory can be read and 
written just like ordinary memory. However, differently from 
ordinary memory: 

1. If a constraint $c$ reads a reactive memory location $\ell$ during
its execution, a dependency $(\ell, c)$ of $c$ on $\ell$ is logged in
a set $D$ of dependencies.

2. If the value stored in a reactive memory location $\ell$ is
changed, all constraints depending on $\ell$ (i.e., all constraints
$c$ such that $(\ell, c) \in D$) are automatically re-executed.

Point 2 states that constraints are sensitive to modifications 
of the contents of the reactive memory. Point 1 shows how to 
maintain dynamically the set D of dependencies needed to 
trigger the appropriate constraints upon changes of reactive 
memory locations. We remark that re-evaluating a constraint 
c may completely change the set of its dependencies: prior 
to re-execution, all the old dependencies $(-, c) \in D$ are
discarded, and new dependencies are logged in D during the 
re-evaluation of c. 
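
The dependency bookkeeping described above can be illustrated with a small sketch. It is written in Python purely for illustration and is not part of DC; the names `run_constraint`, `choose`, and the location names are hypothetical. It shows only points 1 and the remark on re-evaluation: reads performed by a constraint are logged in $D$, and the old dependencies of a constraint are discarded before it runs again.

```python
D = set()   # dependency set: pairs (location, constraint name)
store = {"sel": 1, "a": 10, "b": 20, "out": 0}

def run_constraint(name, body):
    """(Re-)evaluate a constraint: discard its old dependencies, then log new ones."""
    D.difference_update({pair for pair in D if pair[1] == name})

    def read(loc):
        D.add((loc, name))        # point 1: a read logs (loc, c) in D
        return store[loc]

    body(read)

def choose(read):
    # Reads "sel" first, then exactly one of "a" or "b".
    store["out"] = read("a") if read("sel") else read("b")

run_constraint("choose", choose)
deps_first = {loc for (loc, c) in D}     # locations read while sel == 1

store["sel"] = 0
run_constraint("choose", choose)
deps_second = {loc for (loc, c) in D}    # locations read while sel == 0
```

After the second run the constraint no longer depends on `a`, so in this toy model a later change to `a` would not need to trigger it.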

As shown in Figure 2, at any point in time the execution
can be in one of three modes: normal execution, constraint 
execution, or scheduling. As we will see more formally later 
in this section, different instructions (such as reading a re- 
active memory location or assigning it with a value) may 
have different semantics depending on the current execution 
mode. 
We assume eager constraint evaluation, i.e., out-of-date
constraints are brought up-to-date as soon as possible. This 
choice is better suited to our framework and, as previous ex- 
perience has shown, lazy and eager evaluators typically de- 
liver comparable performance in practice [38]. Eager eval- 
uation is achieved as follows. A scheduler maintains a data 
structure S containing constraints to be first executed or re- 
evaluated. As an invariant property, S is guaranteed to be 
empty during normal execution. As soon as a reactive memory
location $\ell$ is written, the scheduler queries the set $D$ of
dependencies and adds to $S$ all the constraints depending on
$\ell$. These constraints are then run one-by-one in constraint
execution mode, and new constraints may be added to S 
throughout this process. Whenever S becomes empty, nor- 
mal execution is resumed. 
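
A minimal sketch of this eager evaluation loop is given below, in Python and for illustration only. The class `Reactive` and its method names are assumptions of the sketch, every location is treated as reactive, and, as a simplification with respect to the formal rules given later, dependents are added to $S$ at the moment a constraint writes a changed value rather than at the end of its execution.

```python
class Reactive:
    """Toy eager solver; every location is treated as reactive."""

    def __init__(self):
        self.store = {}       # location -> value
        self.D = set()        # dependencies: (location, constraint)
        self.S = set()        # constraints waiting to be (re-)evaluated
        self.current = None   # constraint being evaluated, None in normal mode
        self.log = []         # names of constraints, in execution order

    def read(self, loc):
        if self.current is not None:          # constraint mode: log dependency
            self.D.add((loc, self.current))
        return self.store.get(loc, 0)

    def write(self, loc, value):
        if self.store.get(loc, 0) == value:   # unchanged value: nothing to do
            return
        self.store[loc] = value
        self.S |= {c for (l, c) in self.D if l == loc}
        if self.current is None:              # normal mode: solve right away
            self.solve()

    def newcons(self, c):
        self.S.add(c)
        if self.current is None:
            self.solve()

    def solve(self):
        while self.S:                         # normal execution resumes when S is empty
            c = min(self.S, key=lambda f: f.__name__)   # one possible pick
            self.S.discard(c)
            self.D = {(l, k) for (l, k) in self.D if k != c}
            self.current = c
            c(self)
            self.current = None
            self.log.append(c.__name__)

def b_from_a(m):
    m.write("b", m.read("a") + 1)

def c_from_b(m):
    m.write("c", m.read("b") * 2)

m = Reactive()
m.write("a", 1)
m.newcons(b_from_a)   # first evaluation sets b
m.newcons(c_from_b)   # first evaluation sets c
m.write("a", 5)       # propagates to b, then to c
```

The single assignment to `a` at the end re-runs both constraints in dependency order, much like editing one cell of a spreadsheet.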

An exception to eager evaluation is related to atomic 
blocks. The execution of an atomic block c is regarded as an 
uninterruptible operation: new constraints created during the 
evaluation of c are just added to S. When c terminates, for 
each reactive memory location $\ell$ whose value has changed,
all the constraints depending on $\ell$ are also added to $S$, and
the solver is eventually activated. Constraint executions are 
uninterruptible as well. 
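
The effect of atomic blocks can be sketched as follows. This is an illustrative Python model, not DC code: the class `Reactive`, the context manager `atomic_block`, and the constraint `total` are hypothetical names, and all locations are assumed to be reactive. The sketch compares the store before and after an uninterruptible unit (a constraint execution or an atomic block) to decide which constraints to schedule.

```python
from contextlib import contextmanager

class Reactive:
    """Toy solver with atomic blocks; every location is treated as reactive."""

    def __init__(self, **initial):
        self.store = dict(initial)
        self.D, self.S = set(), set()
        self.current = None    # constraint being evaluated, if any
        self.atomic = False    # True inside an atomic block
        self.runs = []         # names of constraints, in execution order

    def read(self, loc):
        if self.current is not None:
            self.D.add((loc, self.current))
        return self.store[loc]

    def write(self, loc, value):
        before = dict(self.store)
        self.store[loc] = value
        if self.current is None and not self.atomic:
            self._schedule_changed(before)
            self.solve()

    def newcons(self, c):
        self.S.add(c)
        if self.current is None and not self.atomic:
            self.solve()

    def _schedule_changed(self, before):
        changed = {l for l, v in self.store.items() if before.get(l) != v}
        self.S |= {c for (l, c) in self.D if l in changed}

    def solve(self):
        while self.S:
            c = min(self.S, key=lambda f: f.__name__)
            self.S.discard(c)
            self.D = {(l, k) for (l, k) in self.D if k != c}
            before, self.current = dict(self.store), c
            c(self)                         # runs without interruption
            self.current = None
            self.runs.append(c.__name__)
            self._schedule_changed(before)

    @contextmanager
    def atomic_block(self):
        if self.atomic or self.current is not None:
            yield                           # already atomic: no extra effect
            return
        before, self.atomic = dict(self.store), True
        try:
            yield
        finally:
            self.atomic = False
        self._schedule_changed(before)      # solver activated at the end of the block
        self.solve()

def total(m):
    m.write("sum", m.read("x") + m.read("y"))

m = Reactive(x=1, y=2, sum=0)
m.newcons(total)              # first evaluation
m.write("x", 10)              # one re-evaluation
with m.atomic_block():
    m.write("x", 100)         # deferred
    m.write("y", 200)         # deferred
    inside = m.store["sum"]   # still the value computed before the block
```

Without the atomic block, the two assignments would each trigger `total`; with it, the constraint runs once, after both inputs have been updated.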

We remark that any scheduling mechanism may be used 
for selecting from S the next constraint to be evaluated: in 
this abstract model we rely on a function pick that imple- 
ments any appropriate scheduling strategy. 

2.3 Configurations 

A configuration of our system is a six-tuple

$$(\rho, a, \sigma, D, S, c_{self}) \in \mathcal{R} \times Bool \times \Sigma \times Dep \times 2^{Cons} \times Cons$$

where:

• $\mathcal{R} = \{\rho : Loc \to \{normal, reactive\}\}$ is a set of
store attributes, i.e., Boolean functions specifying which
memory locations are reactive.

• $Bool = \{true, false\}$ is the set of Boolean values.

• $\Sigma = \{\sigma : Loc \to Val\}$ is a set of stores mapping storage
locations to storable values.

• $Cons$ is the set of constraints and $2^{Cons}$ denotes its
power set. A constraint can be any command in DWhile,
i.e., $Cons = Comm$. We use different names for the
sake of clarity.

• $Dep = 2^{Loc \times Cons}$ is the set of all subsets of dependencies
of constraints on reactive locations.

Besides a store $\sigma$ and its attribute $\rho$, a configuration includes:

• a Boolean flag $a$ that is true inside atomic blocks and is
used for deferring solver activations;

• the set $D$ of dependencies, $D \subseteq Loc \times Cons$;

• the scheduling data structure $S \subseteq Cons$ discussed above;

• a meta-variable $c_{self}$ that denotes the current constraint
(i.e., the constraint that is being evaluated) in constraint
execution mode, and is undefined otherwise. If the scheduler
were deterministic, $c_{self}$ could be omitted from the
configuration, but we do not make this assumption in this
paper.

2.4 Operational Semantics 

Most of the operational semantics of the DWhile lan- 
guage can be directly derived from the standard semantics 
of While. The most interesting aspects of our extension in- 
clude reading and writing the reactive memory, adding and 
deleting constraints, executing commands atomically, and
defining the behavior of the scheduler and its interactions 
with the other execution modes. Rules for these aspects are 
given in Figure 4 and are discussed below.

Let $\Rightarrow_e \subseteq (\Sigma \times Exp) \times Val$ and
$\Rightarrow_c \subseteq (\Sigma \times Comm) \times \Sigma$
be the standard big-step transition relations used in the
operational semantics of the While language [35]. Besides $\Rightarrow_e$
and $\Rightarrow_c$, we use additional transition relations for expression
evaluation in constraint mode ($\Rightarrow_{ce}$), command execution
in normal mode ($\Rightarrow_{nc}$), command execution in constraint
mode ($\Rightarrow_{cc}$), and constraint solver execution in scheduling
mode ($\Rightarrow_s$), as defined in Figure 3. Notice that expression
evaluation in normal mode can be carried out directly by
means of transition relation $\Rightarrow_e$ of While. As discussed
below, relation $\Rightarrow_{ce}$ is obtained by appropriately modifying
$\Rightarrow_e$. Similarly, relations $\Rightarrow_{nc}$ and $\Rightarrow_{cc}$
are obtained by appropriately modifying $\Rightarrow_c$. All the rules
not reported in Figure 4 can be derived in a straightforward way from
the corresponding rules in the standard semantics of While [35].

The evaluation of a DWhile program is started by rule 
EVAL, which initializes the atomic flag a to false and both 
the scheduling queue $S$ and the set $D$ of dependencies to the
empty set. 

Writing Memory. Assigning an ordinary memory location
in normal execution mode (rule ASGN-N1) just changes
the store as in the usual semantics of While. This is also
the case when the new value of the location to be assigned
equals its old value or inside an atomic block. Otherwise, if
the location $\ell$ to be assigned is reactive, the new value
differs from the old one, and execution is outside atomic blocks
(rule ASGN-N2), constraints depending on $\ell$ are scheduled
in $S$ and are evaluated one-by-one. As we will see, the
transition relation $\Rightarrow_s$ guarantees $S$ to be empty at the
end of the constraint solving phase. In conformity with the atomic
execution of constraints, assignment in constraint mode (rule
ASGN-C) just resorts to ordinary assignment in While for
both normal and reactive locations. We will see in rule
SOLVER-2, however, that constraints can nevertheless be
scheduled by other constraints if their execution changes the
contents of reactive memory locations.

Reading Memory. Reading an ordinary memory location
in constraint execution mode (rule DEREF-C1) just evaluates
the location to its value in the current store: this is achieved
by using transition relation $\Rightarrow_e$ of the While semantics.
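$$
\begin{array}{ll}
\Rightarrow\ \subseteq (\mathcal{R} \times \Sigma \times Comm) \times \Sigma
  & \langle \rho, \sigma, c \rangle \Rightarrow \langle \sigma' \rangle \\[4pt]
\Rightarrow_{ce}\ \subseteq (\mathcal{R} \times \Sigma \times Cons \times Dep \times Exp) \times (Dep \times Val)
  & \langle \rho, \sigma, c_{self}, D, e \rangle \Rightarrow_{ce} \langle D', v \rangle \\[4pt]
\Rightarrow_{nc}\ \subseteq (\mathcal{R} \times Bool \times \Sigma \times Dep \times 2^{Cons} \times Comm) \times (\Sigma \times Dep \times 2^{Cons})
  & \langle \rho, a, \sigma, D, S, c \rangle \Rightarrow_{nc} \langle \sigma', D', S' \rangle \\[4pt]
\Rightarrow_{cc}\ \subseteq (\mathcal{R} \times \Sigma \times Dep \times 2^{Cons} \times Cons \times Comm) \times (\Sigma \times Dep \times 2^{Cons})
  & \langle \rho, \sigma, D, S, c_{self}, c \rangle \Rightarrow_{cc} \langle \sigma', D', S' \rangle \\[4pt]
\Rightarrow_{s}\ \subseteq (\mathcal{R} \times \Sigma \times Dep \times 2^{Cons}) \times (\Sigma \times Dep)
  & \langle \rho, \sigma, D, S \rangle \Rightarrow_{s} \langle \sigma', D' \rangle
\end{array}
$$

Figure 3. Transition relations for DWhile program evaluation ($\Rightarrow$), expression evaluation in constraint mode ($\Rightarrow_{ce}$),
command execution in normal mode ($\Rightarrow_{nc}$), command execution in constraint mode ($\Rightarrow_{cc}$), and constraint solver execution in
scheduling mode ($\Rightarrow_s$).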



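$$
\frac{\rho, a, S \vdash \langle \sigma, D, c \rangle \Rightarrow_{nc} \langle \sigma', D' \rangle}
     {\rho \vdash \langle \sigma, c \rangle \Rightarrow \sigma'}
\quad \text{where } a = false,\ D = \emptyset,\ S = \emptyset
\qquad \text{(EVAL)}
$$

$$
\frac{\rho, \sigma, c_{self} \vdash \langle D, e \rangle \Rightarrow_{ce} \langle D', v \rangle
      \qquad \sigma' = \sigma|_{\ell \mapsto v}}
     {\rho, S, c_{self} \vdash \langle \sigma, D, \ell := e \rangle \Rightarrow_{cc} \langle \sigma', D' \rangle}
\qquad \text{(ASGN-C)}
$$

$$
\frac{\sigma \vdash e \Rightarrow_e v \qquad \sigma' = \sigma|_{\ell \mapsto v}}
     {\rho, a, D, S \vdash \langle \sigma, \ell := e \rangle \Rightarrow_{nc} \sigma'}
\quad \text{if } \rho(\ell) = normal \text{ or } \sigma'(\ell) = \sigma(\ell) \text{ or } a = true
\qquad \text{(ASGN-N1)}
$$

$$
\frac{\begin{array}{c}
      S = \emptyset \qquad S' = \{ c \mid (\ell, c) \in D \} \\
      \sigma \vdash e \Rightarrow_e v \qquad \sigma' = \sigma|_{\ell \mapsto v} \qquad
      \rho \vdash \langle \sigma', D, S' \rangle \Rightarrow_s \langle \sigma'', D' \rangle
      \end{array}}
     {\rho, a, S \vdash \langle \sigma, D, \ell := e \rangle \Rightarrow_{nc} \langle \sigma'', D' \rangle}
\quad \text{if } \rho(\ell) = reactive \text{ and } \sigma'(\ell) \neq \sigma(\ell) \text{ and } a = false
\qquad \text{(ASGN-N2)}
$$

$$
\frac{\sigma \vdash \ell \Rightarrow_e v}
     {\rho, \sigma, c_{self}, D \vdash \ell \Rightarrow_{ce} v}
\quad \text{if } \rho(\ell) = normal
\qquad \text{(DEREF-C1)}
$$

$$
\frac{\sigma \vdash \ell \Rightarrow_e v \qquad D' = D \cup \{ (\ell, c_{self}) \}}
     {\rho, \sigma, c_{self} \vdash \langle D, \ell \rangle \Rightarrow_{ce} \langle D', v \rangle}
\quad \text{if } \rho(\ell) = reactive
\qquad \text{(DEREF-C2)}
$$

$$
\frac{\rho, c_{self} \vdash \langle \sigma, D, S, c \rangle \Rightarrow_{cc} \langle \sigma', D', S' \rangle}
     {\rho, c_{self} \vdash \langle \sigma, D, S, \texttt{begin\_at } c \texttt{ end\_at} \rangle \Rightarrow_{cc} \langle \sigma', D', S' \rangle}
\qquad \text{(BEGINEND-C)}
$$

$$
\frac{\rho, a, D \vdash \langle \sigma, S, c \rangle \Rightarrow_{nc} \langle \sigma', S' \rangle}
     {\rho, a, D \vdash \langle \sigma, S, \texttt{begin\_at } c \texttt{ end\_at} \rangle \Rightarrow_{nc} \langle \sigma', S' \rangle}
\quad \text{if } a = true
\qquad \text{(BEGINEND-N1)}
$$

$$
\frac{\begin{array}{c}
      S = \emptyset \qquad a' = true \qquad
      \rho, D \vdash \langle a', \sigma, S, c \rangle \Rightarrow_{nc} \langle \sigma', S' \rangle \\
      S'' = S' \cup \{ c' \mid (\ell, c') \in D \wedge \sigma(\ell) \neq \sigma'(\ell) \wedge \rho(\ell) = reactive \} \qquad
      \rho \vdash \langle \sigma', D, S'' \rangle \Rightarrow_s \langle \sigma'', D' \rangle
      \end{array}}
     {\rho, a, S \vdash \langle \sigma, D, \texttt{begin\_at } c \texttt{ end\_at} \rangle \Rightarrow_{nc} \langle \sigma'', D' \rangle}
\quad \text{if } a = false
\qquad \text{(BEGINEND-N2)}
$$

$$
\frac{S = \emptyset \qquad S' = \{ c \} \qquad
      \rho \vdash \langle \sigma, D, S' \rangle \Rightarrow_s \langle \sigma', D' \rangle}
     {\rho, a, S \vdash \langle \sigma, D, \texttt{newcons } c \rangle \Rightarrow_{nc} \langle \sigma', D' \rangle}
\quad \text{if } a = false
\qquad \text{(NEWCONS-N1)}
$$

$$
\frac{S' = S \cup \{ c \}}
     {\rho, a, \sigma, D \vdash \langle S, \texttt{newcons } c \rangle \Rightarrow_{nc} S'}
\quad \text{if } a = true
\qquad \text{(NEWCONS-N2)}
$$

$$
\frac{D' = D \setminus \{ (-, c) \} \qquad S' = S \setminus \{ c \}}
     {\rho, a, \sigma \vdash \langle D, S, \texttt{delcons } c \rangle \Rightarrow_{nc} \langle D', S' \rangle}
\qquad \text{(DELCONS-N)}
$$

$$
\frac{S' = S \cup \{ c \}}
     {\rho, \sigma, D, c_{self} \vdash \langle S, \texttt{newcons } c \rangle \Rightarrow_{cc} S'}
\qquad \text{(NEWCONS-C)}
$$

$$
\frac{D' = D \setminus \{ (-, c) \} \qquad S' = S \setminus \{ c \}}
     {\rho, \sigma, c_{self} \vdash \langle D, S, \texttt{delcons } c \rangle \Rightarrow_{cc} \langle D', S' \rangle}
\qquad \text{(DELCONS-C)}
$$

$$
\rho \vdash \langle \sigma, D, S \rangle \Rightarrow_s \langle \sigma, D \rangle
\quad \text{if } S = \emptyset
\qquad \text{(SOLVER-1)}
$$

$$
\frac{\rho \vdash \langle \sigma, D', S \setminus \{ c_{self} \}, c_{self}, c_{self} \rangle \Rightarrow_{cc} \langle \sigma', D'', S' \rangle
      \qquad
      \rho \vdash \langle \sigma', D'', S'' \rangle \Rightarrow_s \langle \sigma'', D''' \rangle}
     {\rho \vdash \langle \sigma, D, S \rangle \Rightarrow_s \langle \sigma'', D''' \rangle}
\qquad \text{(SOLVER-2)}
$$

$$
\text{where: } \quad
\begin{array}{l}
c_{self} = pick(S) \\
D' = D \setminus \{ (-, c_{self}) \} \\
S'' = S' \cup \{ c \mid (\ell, c) \in D'' \wedge \sigma(\ell) \neq \sigma'(\ell) \wedge \rho(\ell) = reactive \}
\end{array}
$$

Figure 4. DWhile program evaluation.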
If the location $\ell$ to be read is reactive (rule DEREF-C2), a
new dependency of the active constraint $c_{self}$ on $\ell$ is also
added to the set $D$ of dependencies.

Executing Atomic Blocks. To execute an atomic block
in normal mode (rule BEGINEND-N2), the uninterruptible
command $c$ is first evaluated according to the rules defined
by transition $\Rightarrow_{nc}$. If the content of some reactive
location changes due to the execution of $c$, the solver is then
activated at the end of the block. The begin_at / end_at
command has instead no effect when execution is already
atomic, i.e., in constraint mode (rule BEGINEND-C) and
inside atomic blocks (rule BEGINEND-N1), except for executing
command $c$.

Creating and Deleting Constraints. In non-atomic normal
execution mode, rule NEWCONS-N1 creates a new constraint
and triggers its first execution by resorting to $\Rightarrow_s$.
In atomic normal execution and in constraint mode, rules
NEWCONS-N2 and NEWCONS-C simply add the constraint
to the scheduling queue. Similarly, rules DELCONS-N and
DELCONS-C remove the constraint from the scheduling
queue and clean up its dependencies from $D$.

Activating the Solver. Rules SOLVER-1 and SOLVER-2
specify the behavior of the scheduler, which is started by
rules ASGN-N2 and BEGINEND-N2. Rule SOLVER-1 defines
the termination of the constraint solving phase: this
phase ends only when there are no more constraints to be
evaluated (i.e., $S = \emptyset$). Rule SOLVER-2 has an inductive
definition. If $S$ is not empty, function pick selects from $S$ a
new active constraint $c_{self}$, which is evaluated in constraint
mode after removing from $D$ its old dependencies. The final
state ($\sigma''$) and dependencies ($D'''$) are those obtained by
applying the scheduler on the store $\sigma'$ obtained after the
execution of $c_{self}$ and on a new set $S''$ of constraints. $S''$ is
derived from $S$ by adding any new constraints (collected in $S'$)
resulting from the execution of $c_{self}$ along with the constraints
depending on reactive memory locations whose content has
been changed by $c_{self}$. The definition of $S''$ guarantees that
constraints can trigger other constraints (even themselves),
even if each constraint execution is regarded as an atomic
operation and is never interrupted by the scheduler.
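
As an informal companion to rules SOLVER-1 and SOLVER-2, the following Python sketch renders the solver as a loop over the scheduling set. It is an illustration under stated assumptions, not the DC implementation: constraints are modeled as Python functions taking hypothetical `read` and `write` callbacks, `rho` maps each location to the string "normal" or "reactive", and the two `pick` strategies are arbitrary choices made for the example.

```python
def solve(rho, sigma, D, S, pick):
    """Run constraints until S is empty; returns the final (sigma, D)."""
    sigma, D, S = dict(sigma), set(D), set(S)
    while S:                                   # SOLVER-1 applies once S is empty
        c_self = pick(S)                       # SOLVER-2: choose the active constraint
        S.discard(c_self)
        D = {(l, c) for (l, c) in D if c is not c_self}   # drop its old dependencies
        before = dict(sigma)

        def read(loc):
            if rho[loc] == "reactive":         # DEREF-C2: log the dependency
                D.add((loc, c_self))
            return sigma[loc]

        def write(loc, value):                 # ASGN-C: plain store update
            sigma[loc] = value

        c_self(read, write)                    # runs without interruption
        S |= {c for (l, c) in D                # constraints triggered by changed cells
              if rho[l] == "reactive" and before.get(l) != sigma[l]}
    return sigma, D

def clamp(read, write):      # keeps x at most 10; reads and writes x
    write("x", min(10, read("x")))

def double(read, write):     # y = 2 * x
    write("y", 2 * read("x"))

rho = {"x": "reactive", "y": "reactive"}
first = lambda S: min(S, key=lambda c: c.__name__)
last = lambda S: max(S, key=lambda c: c.__name__)

results = []
for pick in (first, last):
    # Both constraints are created together, then evaluated for the first time.
    sigma, D = solve(rho, {"x": 3, "y": 0}, set(), {clamp, double}, pick)
    sigma["x"] = 50                              # assignment x := 50 in normal mode
    S = {c for (l, c) in D if l == "x"}          # ASGN-N2: schedule dependents of x
    sigma, D = solve(rho, sigma, D, S, pick)
    results.append(sigma)
```

In this run `clamp` changes a location it also reads, so it reschedules itself once, and `double` is re-executed after `x` settles; both scheduling strategies end in the same store.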

3. Convergence Properties 

In this section, we discuss some general properties of the 
constraint solving mechanism we adopt in DC, including 
termination, correctness, and running times. The computa- 
tion of one-way dataflow constraints (similarly to spread- 
sheet formulas, circuits, etc.) is traditionally described in 
the literature in terms of a bipartite directed graph called 
dataflow graph. In a dataflow graph, a node can model either
an execution unit (e.g., gate [6], process [23], one-way
constraint [38], or spreadsheet formula [28]) or an input/output
port of one or more units (e.g., gate port, variable, or
cell). There is an arc from a port to an execution unit if the 



unit uses that port as a parameter, and from an execution 
unit to a port if the unit assigns a value to that port. Paths 
in a dataflow graph, which is usually acyclic, describe how 
data flows through the system, and the result of a compu- 
tation can be characterized algorithmically in terms of an 
appropriate traversal of the graph (e.g., in a topological or- 
der). This model is very effective in describing scenarios 
where data dependencies are either specified explicitly, or 
can be derived statically from the program. However, in general
the dataflow graph might not be known in advance or
may evolve over time in a manner that may be difficult to 
characterize. In all such cases, proving general properties 
of programs based on the evaluation of the dataflow graph 
may not be easy. A more general approach, which we fol- 
low in our work, consists of modeling dataflow constraint 
solving as an iterative process that aims at finding a com- 
mon fixpoint for the current set of constraints. In our con- 
text, a fixpoint is a store that satisfies simultaneously all the 
relations between reactive memory locations specified by the 
constraints. This provides a unifying framework for solving 
dataflow constraint systems with both acyclic and cyclic
dependencies.

3.1 Independence of the Scheduling Order 

In Section 2, we have assumed that the scheduling order of
constraint executions is specified by a function pick given
as a parameter of the solver. A natural question is whether
there are any general properties of a set of constraints that
let our solver terminate and converge to a common fixpoint
independently of the scheduling strategy used by function
pick. Using results from the theory of function iterations [15],
we show that any arbitrary collection of inflationary
one-way constraints has the desired property. This class
of constraints includes, for instance, any program that can be
described in terms of an acyclic dataflow graph such as
computational circuits, non-circular attribute grammars [29],
and spreadsheets [28] (see Section 3.2). We remark, however,
that this class is more general, as it also makes it possible to
address problems that would not be solvable without cyclic
dependencies (an example is given in Section 3.3).

We first provide some preliminary definitions in accordance
with the terminology used in [7]. We model constraint
execution as the application of functions on stores: 

Definition 1. We denote by $f_c : \Sigma \to \Sigma$ the function
computed by a constraint $c \in Cons$, where $f_c(\sigma) = \sigma'$ if
$\langle \sigma, c \rangle \Rightarrow_c \sigma'$. We say that store
$\sigma \in \Sigma$ is a FIXPOINT for $f_c$ if $f_c(\sigma) = \sigma$.

To simplify the discussion, throughout this section we as- 
sume that constraints only operate on reactive cells and fo- 
cus our attention on stores where all locations are reactive. 
The definition of inflationary functions assumes that a partial 
ordering is defined on the set of stores $\Sigma$:
Definition 2 (Inflationary Functions). Let $(\Sigma, \preceq)$ be
any partial ordering over the set of stores $\Sigma$ and let
$f : \Sigma \to \Sigma$ be a function on $\Sigma$. We say that $f$ is
INFLATIONARY if $\sigma \preceq f(\sigma)$ for all $\sigma \in \Sigma$.

Examples of partial orderings on $\Sigma$ will be given in Section 3.2
and in Section 3.3. A relevant property of partial
orderings in our context is the finite chain property, based 
on the notion of sequence stabilization: 

Definition 3 (Finite Chain Property). A partial ordering
$(\Sigma, \preceq)$ over $\Sigma$ satisfies the FINITE CHAIN PROPERTY if
every non-decreasing sequence of elements
$\sigma_0 \preceq \sigma_1 \preceq \sigma_2 \preceq \ldots$ from $\Sigma$
eventually stabilizes at some element $\sigma$ in $\Sigma$, i.e.,
if there exists $j \geq 0$ such that $\sigma_i = \sigma$ for all $i \geq j$.

To describe the store modifications due to the execution of
the solver, we use the notion of iteration of functions on
stores. Let $F = \{f_1, \ldots, f_n\}$, $\langle a_1, \ldots, a_k \rangle$,
and $\sigma \in \Sigma$ be a finite set of functions on $\Sigma$, a sequence
of indices in $[1, n]$, and an initial store, respectively. An
iteration of functions of $F$ starting at $\sigma$ is a sequence of
stores $\langle \sigma_0, \sigma_1, \sigma_2, \ldots \rangle$ where
$\sigma_0 = \sigma$ and $\sigma_i = f_{a_i}(\sigma_{i-1})$ for $i > 0$.
We say that function $f_{a_i}$ is activated at step $i$. Iterations of
functions that lead to a fixed point are called regular:

Definition 4 (Regular Function Iteration). A function
iteration $\langle \sigma_0, \sigma_1, \sigma_2, \ldots \rangle$ is REGULAR
if it satisfies the following property: for all $f \in F$ and $i \geq 0$,
if $\sigma_i$ is not a fixpoint for $f$, then $f$ is activated at some
step $j > i$.

Using arguments from Chapter 7 of [7], it can be proved
that any regular iteration of inflationary functions starting at 
some initial store stabilizes in a finite number of steps to a 
common fixpoint: 

Lemma 1 (Fixpoint). Let $(\Sigma, \preceq)$ be any partial ordering
over $\Sigma$ satisfying the finite chain property and let $F$ be a
finite set of inflationary functions on $\Sigma$. Then any regular
iteration of $F$ starting at $\sigma$ eventually stabilizes at a common
fixpoint $\sigma'$ of the functions in $F$ such that $\sigma \preceq \sigma'$.

We can now discuss convergence properties of our solver: 

Theorem 1. Let $C = \{c_1, \ldots, c_k\}$ be any set of constraints,
let $F = \{f_{c_1}, \ldots, f_{c_k}\}$ be the functions computed
by constraints in $C$, and let $(\Sigma, \preceq)$ be any partial ordering
over $\Sigma$ satisfying the finite chain property. If functions in $F$
are inflationary on $\Sigma$ and
$\{f \in F \mid f(\sigma) \neq \sigma\} \subseteq S \subseteq F$,
then $\langle \rho, \sigma, D, S \rangle \Rightarrow_s \langle \sigma', D' \rangle$
and $\sigma'$ is a common fixpoint of the functions in $F$ such that
$\sigma \preceq \sigma'$.

Proof (sketch). Consider the sequence $\langle S_0, S_1, \ldots \rangle$ of scheduling sets resulting from a recursive application of rule SOLVER-2 terminated by rule SOLVER-1 (see the figure giving the solver rules), with $S_0 = S$. Let $c_i = pick(S_i)$ be the constraint executed at step $i$, and let $\langle \sigma_0, \sigma_1, \sigma_2, \ldots \rangle$ be the function iteration such that $\sigma_0 = \sigma$ and $\sigma_{i+1} = f_{c_i}(\sigma_i)$. We prove that this iteration is regular. Notice that $S_0 = S$ initially contains all functions for which $\sigma_0$ is not a fixpoint. Furthermore, $S_{i+1}$ is obtained from $S_i$ by removing $c_i$ and adding at least all constraints for which $\sigma_{i+1}$ is not a fixpoint. It remains to show that all constraints are activated at some step, i.e., they are eventually removed from $S$. This can be proved by observing that an inflationary function either leaves the store unchanged, and therefore $|S|$ decreases by one, or produces a store $\sigma_{i+1} = f_{c_i}(\sigma_i)$ strictly larger than $\sigma_i$, i.e., $\sigma_i \preceq \sigma_{i+1}$ and $\sigma_i \neq \sigma_{i+1}$. By the finite chain property, this cannot happen indefinitely, so $S$ eventually becomes empty. Since the iteration is regular, the proof follows from Lemma 1. □

Assuming that functions in Lemma 1 and Theorem 1 are also monotonic, it is possible to prove that the solver always converges to the least common fixpoint, yielding deterministic results independently of the scheduling order. We recall that a function $f$ is monotonic if $\sigma \preceq \sigma'$ implies $f(\sigma) \preceq f(\sigma')$ for all $\sigma, \sigma' \in \Sigma$.
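The convergence argument can be illustrated with a small simulation that is not part of the paper. It assumes stores are integer vectors ordered componentwise, and uses a hypothetical constraint type `MaxInto` that raises one cell to the value of another. Such constraints are inflationary and monotonic, and cell values are bounded by the initial maximum, so the finite chain property holds on the reachable stores. The names `Store`, `MaxInto`, `run_constraint`, and `solve` are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

using Store = std::vector<int>;

// Hypothetical constraint: store[dst] := max(store[dst], store[src]).
// It never lowers a cell, so it is inflationary w.r.t. the componentwise order.
struct MaxInto { std::size_t src, dst; };

// Executes constraint c on s; returns true if the store changed.
bool run_constraint(const MaxInto& c, Store& s) {
    if (s[c.src] <= s[c.dst]) return false;   // s is already a fixpoint of c
    s[c.dst] = s[c.src];
    return true;
}

// Worklist solver: runs scheduled constraints until a common fixpoint is reached.
Store solve(Store s, const std::vector<MaxInto>& cs) {
    std::deque<std::size_t> sched;             // scheduling set S, FIFO pick
    std::vector<bool> queued(cs.size(), true);
    for (std::size_t i = 0; i < cs.size(); ++i) sched.push_back(i);
    while (!sched.empty()) {
        std::size_t i = sched.front();
        sched.pop_front();
        queued[i] = false;
        if (!run_constraint(cs[i], s)) continue;   // store unchanged: S shrinks
        // The store grew strictly: reschedule constraints reading the written cell.
        for (std::size_t j = 0; j < cs.size(); ++j)
            if (cs[j].src == cs[i].dst && !queued[j]) {
                sched.push_back(j);
                queued[j] = true;
            }
    }
    return s;
}
```

Even with a cyclic dependency among three cells, the sketch stops once every cell holds the largest initial value, which is the least common fixpoint above the starting store.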

3.2 Acyclic Systems of Constraints 

In this section we show that, if a system of constraints is 
acyclic, then our solver always converges deterministically 
to the correct result, without the need for programmers to 
prove any stabilization properties of their constraints. We 
notice that this is the most common case in many appli- 
cations, and several efficient techniques can be adopted by 
constraint solvers to automatically detect cycles introduced 
by programming errors llssl . In particular, we prove termi- 
nation of our solver on any system of constraints that models 
a computational circuit subject to incremental changes of its 
input. For the sake of simplicity, we focus on single-output 
constraints, to which any multi-output constraint can be re- 
duced. 

A circuit is a directed acyclic graph $G = (V, E)$ with values computed at the nodes, referred to as gates [6]. Each node $u$ is associated with an output function $g_u$ that computes a value $val(u) = g_u(val(v_1), \ldots, val(v_{d_u}))$, where $d_u$ is the indegree of node $u$ and, for each $i \in [1, d_u]$, arc $(v_i, u) \in E$. Arcs entering $u$ are ordered and, if $d_u = 0$, $u$ is called an input gate (in this case, $g_u$ is constant). The gate values and functions may have any data types. For simplicity, we will assume that there is only one gate in the graph with outdegree 0: the value computed at this gate is the output of the circuit. The circuit value problem is to update the output of the circuit whenever the value of an input gate is changed. This problem is equivalent to the scenario where the gate values $val(u)$ are reactive memory cells, and each non-input gate $u$ is computed by a constraint $c_u$ that assigns $val(u)$ with $g_u(val(v_1), \ldots, val(v_{d_u}))$. A circuit update operation changes the value $val(u)$ of any input gate $u$ to a new constant. Any such update triggers the solver with $S = \{c_{u'} \mid (u, u') \in E\}$. We now show that the solver correctly updates the circuit output.






Let $n$ be the number of circuit nodes, and let $u_1, u_2, \ldots, u_n$ be any topological ordering of the nodes, where $u_n$ is the output gate of the circuit. Let $\sigma \in \Sigma$ be a store with $dom(\sigma) = \{val(u_i) \mid i \in [1, n]\}$. We say that a value $val(u_i)$ is incorrect in $\sigma$ if $\sigma$ is not a fixpoint for $c_{u_i}$. Let $b(\sigma) = \sum_{i=1}^{n} 2^{n-i} \cdot \chi_\sigma(u_i)$, where $\chi_\sigma(u_i) = 1$ if value $val(u_i)$ is incorrect in $\sigma$, and 0 otherwise. We define a partial ordering $(\Sigma, \preceq)$ as follows:

Definition 5. For any two stores $\sigma$ and $\sigma'$ in $\Sigma$, we say that $\sigma \preceq \sigma'$ if $b(\sigma) \ge b(\sigma')$.

Relation $\preceq$ is clearly reflexive, antisymmetric, and transitive. Moreover, the constraints $c_u$ compute inflationary functions on $\Sigma$. Let $\sigma' = f_{c_u}(\sigma)$ be obtained by evaluating constraint $c_u$ in store $\sigma$. If $val(u)$ is correct in $\sigma$ (i.e., $\chi_\sigma(u) = 0$), then $\sigma' = \sigma$. Otherwise, $\chi_\sigma(u) = 1$, $\chi_{\sigma'}(u) = 0$, and $\chi_\sigma(v) = \chi_{\sigma'}(v)$ for all gates $v$ that precede $u$ in the topological ordering. This implies that $b(\sigma) > b(f_{c_u}(\sigma))$, and thus $\sigma \preceq f_{c_u}(\sigma)$. Since $b(\cdot)$ can assume only $2^n$ possible values, $(\Sigma, \preceq)$ also satisfies the finite chain property. After updating an input gate $u$, all constraints for which $\sigma$ is no longer a fixpoint are included in the set $S = \{c_{u'} \mid (u, u') \in E\}$ on which the solver is started. Hence, all the hypotheses of Theorem 1 hold and the solver converges to a store $\sigma' \in \Sigma$ that is a common fixpoint of all $f_{c_u}$, i.e., a store $\sigma'$ in which all gate values are correct. This implies that the circuit output is also correct. It is not difficult to prove that, if we let function $pick(S)$ return constraints in topological order, then all gates that may be affected by the input change are evaluated exactly once during the update.
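The effect of a topological pick function can be sketched with a small stand-alone simulation, which is not the DC runtime. It assumes gates are created after their inputs, so that gate indices already form a topological order, and it uses an ordered set as the scheduling set. The names `Circuit`, `Op`, `add_input`, `add_gate`, and `update` are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <vector>

enum class Op { Input, Sum, Prod };

struct Circuit {
    std::vector<Op> op;                    // gate kinds, indexed in topological order
    std::vector<std::vector<int>> in;      // ordered predecessors of each gate
    std::vector<std::vector<int>> succ;    // successors of each gate
    std::vector<long> val;                 // val(u)
    int evaluations = 0;                   // constraint executions during updates

    int add_input(long v) {
        op.push_back(Op::Input);
        in.emplace_back();
        succ.emplace_back();
        val.push_back(v);
        return (int)val.size() - 1;
    }
    // Inputs must already exist, so indices follow a topological order.
    int add_gate(Op o, const std::vector<int>& inputs) {
        int u = add_input(0);
        op[u] = o;
        in[u] = inputs;
        for (int v : inputs) succ[v].push_back(u);
        val[u] = eval(u);
        return u;
    }
    // Output function g_u of a Sum or Prod gate.
    long eval(int u) const {
        long r = (op[u] == Op::Sum) ? 0 : 1;
        for (int v : in[u]) r = (op[u] == Op::Sum) ? r + val[v] : r * val[v];
        return r;
    }
    // Changes the value of input gate u and propagates the change.
    void update(int u, long v) {
        val[u] = v;
        std::set<int> sched(succ[u].begin(), succ[u].end());
        while (!sched.empty()) {
            int g = *sched.begin();        // pick: smallest topological index
            sched.erase(sched.begin());
            ++evaluations;
            long nv = eval(g);
            if (nv == val[g]) continue;    // value unchanged: nothing to propagate
            val[g] = nv;
            sched.insert(succ[g].begin(), succ[g].end());
        }
    }
};
```

Because a gate is picked only after all of its scheduled predecessors, and only gates with larger indices are inserted afterwards, no gate is evaluated twice within one update.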

3.3 Cyclic Systems of Constraints: an Example 

Differently from previous approaches to solving one-way 
dataflow constraints |@, [13, lU 



2711 . which were targeted to 



acyclic dependencies, our abstract machine can handle the most general case of cyclic constraints embedded within an imperative program. This opens up the possibility to address problems that would not be solvable using acyclic dataflow graphs, backed up with a formal machinery to help designers prove their convergence properties (Section 3.1). We exemplify this concept by considering the well known problem of maintaining distances in a graph subject to local changes to its nodes or arcs. In the remainder of this section we show how to specify an incremental variant of the classical Bellman-Ford's single-source shortest path algorithm [8] in terms of a (possibly cyclic) system of one-way constraints. Compared to purely imperative specifications [18], the formulation of the incremental algorithm in our mixed imperative/dataflow framework is surprisingly simple and requires just a few lines of code. By suitably defining a partial order on $\Sigma$ and an appropriate pick function, we show that our solver finds a correct solution within the best known worst-case time bounds for the problem.

    insert(u, v, w):
        E := E ∪ {(u, v)}
        w(u, v) := w
        newcons( if d[u] + w(u, v) < d[v] then d[v] := d[u] + w(u, v) )

    decrease(u, v, δ):
        w(u, v) := w(u, v) − δ

Figure 5. Incremental shortest path updates in a mixed imperative/dataflow style.

Incremental Shortest Paths. Let $G = (V, E, w)$ be a directed graph with real edge weights $w(u, v)$, and let $s$ be a source node in $V$. We consider the incremental shortest path problem that consists of updating the distances $d[u]$ of all nodes $u \in V$ from the source $s$ after inserting any new edge in the graph, or decreasing the weight of any existing edge. For the sake of simplicity we assume that no negative-weight cycles are introduced by the update and that, if a node $u$ is unreachable from the source, its distance $d[u]$ is $+\infty$.

Update Algorithm. The incremental shortest path problem can be solved in our framework as follows. We keep edge weights and distances in reactive memory. Assuming we start from a graph with no edges, we initialize $d[s] := 0$ and $d[u] := +\infty$ for all $u \neq s$. The pseudocode of update operations that insert a new edge and decrease the weight of an existing edge by a positive amount $\delta$ is shown in Figure 5. Operation insert(u, v, w) adds edge $(u, v)$ to the graph with weight $w$ and creates a new constraint $c_{uv}$ for the edge: $c_{uv}$ simply relaxes the edge if Bellman's inequality $d[u] + w(u, v) \ge d[v]$ is violated [8]. The constraint is immediately executed after creation (see rule NewCons-n1 of the abstract machine) and the three pairs $(d[u], c_{uv})$, $(d[v], c_{uv})$, and $(w(u, v), c_{uv})$ are added to the set of dependencies $D$. Any later change to $d[u]$, $d[v]$ or $w(u, v)$, which may violate the inequality $d[u] + w(u, v) \ge d[v]$, will cause the re-execution of $c_{uv}$. Decreasing the weight of an existing edge $(u, v)$ by any positive constant $\delta$ with decrease(u, v, δ) can be done by just updating $w(u, v)$. In view of rule Asgn-n2 of the abstract machine, the system reacts to the change and automatically re-executes $c_{uv}$ and any other affected constraints.

Using the machinery developed in Section 3.1 and suitably defining a partial order on $\Sigma$ and an appropriate pick function, we now show that our solver finds a correct solution within the best known worst-case time bounds for the problem, i.e., it updates distances correctly after any insert or decrease operation.
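The update operations of Figure 5 can be mimicked by a stand-alone simulation in which each edge owns a relaxation step and a queue plays the role of the scheduling set. This is only a sketch under simplifying assumptions: dependencies are hard-wired (a write to `d[v]` reschedules the edges leaving `v`), and a FIFO pick is used rather than the priority-based pick discussed later. The class `IncrementalSSSP` and the constant `kInf` are hypothetical names.

```cpp
#include <cassert>
#include <deque>
#include <limits>
#include <vector>

const double kInf = std::numeric_limits<double>::infinity();

class IncrementalSSSP {
    struct Edge { int u, v; double w; };
    std::vector<double> d;               // d[u], initially +infinity except d[s] = 0
    std::vector<Edge> edges;             // one constraint c_uv per edge
    std::vector<std::vector<int>> out;   // ids of the edges leaving each node
    std::deque<int> sched;               // scheduled constraints (FIFO pick)

    void solve() {
        while (!sched.empty()) {
            Edge e = edges[sched.front()];
            sched.pop_front();
            if (d[e.u] + e.w < d[e.v]) {                     // Bellman's inequality violated
                d[e.v] = d[e.u] + e.w;                       // relax the edge
                for (int f : out[e.v]) sched.push_back(f);   // constraints reading d[v]
            }
        }
    }

public:
    IncrementalSSSP(int n, int s) : d(n, kInf), out(n) { d[s] = 0; }

    // Adds edge (u, v) with weight w and runs its constraint right away.
    int insert(int u, int v, double w) {
        edges.push_back({u, v, w});
        int id = (int)edges.size() - 1;
        out[u].push_back(id);
        sched.push_back(id);
        solve();
        return id;
    }
    // Decreases the weight of edge id by delta > 0; the write reschedules c_uv.
    void decrease(int id, double delta) {
        edges[id].w -= delta;
        sched.push_back(id);
        solve();
    }
    double dist(int u) const { return d[u]; }
};
```

With no negative-weight cycles, distances only decrease and the queue eventually empties, which mirrors the termination argument developed in the rest of this section.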

Termination and Correctness. For the sake of convenience, we denote by $d_\sigma[u] = \sigma(d[u])$ and by $w_\sigma(u, v) = \sigma(w(u, v))$ the distance of node $u$ and the weight of edge $(u, v)$ in store $\sigma$, respectively. For any two graphs $G_1$ and $G_2$ on the same vertex set, we denote by $G_1 \uplus G_2$ the multigraph with vertex set $V$ and edge set $E_1 \uplus E_2$, where $\uplus$ indicates the join of multisets: the same edge may thus appear twice in $E_1 \uplus E_2$, possibly with different weights. Let us focus our attention on a restricted set of stores, which encompasses all possible configurations of the reactive memory during the execution of the solver triggered by an update:

Definition 6. Let $G_{old} = (V, E_{old}, w_{old})$ and $G = (V, E, w)$ be the graph before and after inserting a new edge or decreasing the weight of an edge, respectively. We denote by $\Sigma_{sp} \subseteq \Sigma$ the set of all functions $\sigma : Loc \to Val$ such that:

• $dom(\sigma) = \{d[u] \mid u \in V\} \cup \{w(u, v) \mid (u, v) \in E\} \subseteq Loc$;

• for each $u \in V$, $d_\sigma[u]$ is the weight of a simple path (i.e., with no repeated nodes) from $s$ to $u$ in $G_{old} \uplus G$;

• for each $(u, v) \in E$, $w_\sigma(u, v)$ is fixed as the weight of edge $(u, v)$ in $G$.

Notice that, as the number of simple paths in a graph is finite, the number of possible values each $d_\sigma[u]$ can attain is finite, and therefore $\Sigma_{sp}$ is a finite set. We define a partial ordering $(\Sigma_{sp}, \preceq)$ on $\Sigma_{sp}$ as follows:

Definition 7. Let $\sigma$ and $\sigma'$ be any two stores in $\Sigma_{sp}$. We say that $\sigma \preceq \sigma'$ if $d_\sigma[u] \ge d_{\sigma'}[u]$ for all $u \in V$.

Relation $\preceq$ is reflexive, antisymmetric, and transitive. Moreover, since $\Sigma_{sp}$ is finite, $(\Sigma_{sp}, \preceq)$ satisfies the finite chain property. We now prove that constraints $c_{uv}$ compute inflationary functions on $\Sigma_{sp}$.

Lemma 2. Functions $f_{c_{uv}}$ computed by constraints $c_{uv}$ of Figure 5 are inflationary with respect to the partial ordering $(\Sigma_{sp}, \preceq)$ of Definition 7.

Proof. Let $(u, v)$ be any edge in $E$ and let $\sigma$ be any store in $\Sigma_{sp}$. If $\sigma = f_{c_{uv}}(\sigma)$ then clearly $\sigma \preceq f_{c_{uv}}(\sigma)$. Consider the case $\sigma \neq f_{c_{uv}}(\sigma)$. Notice that $\sigma$ and $f_{c_{uv}}(\sigma)$ can only differ in the value of memory location $d[v]$. Since there are no negative-weight cycles and $d_\sigma[u]$ is the weight of a simple path in $G_{old} \uplus G$, then so is $d_{f_{c_{uv}}(\sigma)}[v] = d_\sigma[u] + w_\sigma(u, v)$. Furthermore, as $c_{uv}$ never increases distances, then $d_{f_{c_{uv}}(\sigma)}[v] \le d_\sigma[v]$. Therefore, $\sigma \preceq f_{c_{uv}}(\sigma)$. This shows that $f_{c_{uv}}$ is inflationary. □

It is not difficult to see that, if all distances are correct before an update, then the solver is started on a store $\sigma \in \Sigma_{sp}$. As all the hypotheses of Theorem 1 hold, the solver converges to a store $\sigma' \in \Sigma_{sp}$ that is a common fixpoint of all $f_{c_{uv}}$. Therefore: (1) the values $d_{\sigma'}[u]$ are weights of simple paths in $G$, and (2) they satisfy all Bellman's inequalities. It is well known [5] that node labels satisfying both properties (1) and (2) are in fact the correct distances in the graph.

Running Time. If we let function $pick(S)$ return the constraint $c_{uv} \in S$ with the largest variation of $d[u]$ due to the update, then we can adapt results from [18] and show that each $c_{uv}$ is executed by the solver at most once per update. If pick uses a standard priority queue with $O(\log n)$ time per operation, then the solver updates distances incrementally in $O(m \log n)$ worst-case time, even in the presence of negative edge weights (but no negative cycles). This can be reduced to $O(m + n \log n)$ by just creating one constraint per node, rather than one constraint per edge, and letting it relax all outgoing edges. This matches the best known algorithmic bounds for the problem [18]. We remark that recomputing distances from scratch with the best known static algorithm would require $O(mn)$ time in the worst case if there are negative edge weights, and $O(m + n \log n)$ time otherwise. Later in the paper we will analyze experimentally the performance of our constraint-based approach, showing that in practice it can be orders of magnitude faster than recomputing from scratch, even when all weights are non-negative.

    typedef void (*cons_t)(void*);

    int   newcons(cons_t cons, void* param);
    void  delcons(int cons_id);
    void* rmalloc(size_t size);
    void  rfree(void* ptr);
    void  begin_at();
    void  end_at();
    void  arm_final(int cons_id, cons_t final);
    void  set_comp(int (*comp)(void*, void*));

Figure 6. Main functions of the DC language extension.

4. Embodiment into C/C++ 

In this section we show how to apply the concepts developed in Section 2 to the C and C++ languages, deriving an extension that we call DC. DC has exactly the same syntax as C/C++, but operations that read or modify objects have a different semantics. All other primitives, including creating/deleting constraints, allocating/deallocating reactive objects, and opening/closing atomic blocks, are provided as runtime library functions¹ (see Figure 6).

Reactive Memory Allocation. Similarly to other automatic change propagation approaches (e.g., [10, 13]), in DC all objects allocated statically or dynamically are non-reactive by default. Reactive locations are allocated dynamically using library functions rmalloc and rfree, which work just like malloc and free, but on a separate heap.

Opening and Closing Atomic Blocks. Atomic blocks are 
supported in DC using two library functions begin_at and 
end_at. Calling begin_at opens an atomic block, which 
should be closed with a matching call to end_at. Nested 
atomic blocks are allowed, and are handled using a counter 
of nesting levels so that the solver is only resumed at the end 
of the outer block, processing any pending constraints that 
need to be first executed or brought up to date as a result of 
the block's execution. 
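The nesting-counter behaviour described above can be sketched as follows. This is a hypothetical stand-in, not the DC runtime: `MiniRuntime`, `schedule`, and `run_solver` are invented names, and only the counter logic is modelled (no reactive memory or dependency tracking).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the runtime; only the nesting counter is modelled.
struct MiniRuntime {
    int nesting = 0;                               // number of open atomic blocks
    bool solving = false;                          // guards against re-entrant solving
    int solver_runs = 0;                           // how many times the solver resumed
    std::vector<std::function<void()>> pending;    // constraints waiting to run

    void begin_at() { ++nesting; }
    void end_at() {
        // Only closing the outermost block resumes the solver.
        if (nesting > 0 && --nesting == 0) run_solver();
    }
    void schedule(std::function<void()> c) {
        pending.push_back(std::move(c));
        if (nesting == 0) run_solver();            // outside atomic blocks: run at once
    }
    void run_solver() {
        if (solving || pending.empty()) return;
        solving = true;
        ++solver_runs;
        for (std::size_t i = 0; i < pending.size(); ++i) {
            std::function<void()> c = pending[i];  // copy: c may schedule more work
            c();
        }
        pending.clear();
        solving = false;
    }
};
```

In this sketch, work scheduled inside two nested blocks stays pending after the inner `end_at` and is processed in a single solver run when the outer block closes.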

¹ A detailed documentation of the DC application programming interface, including stricter library naming conventions and several additional features not covered in this paper, is available at the URL: http://www.dis.uniroma1.it/~demetres/dc/
    struct robject {
        void* operator new(size_t size) { return rmalloc(size); }
        void operator delete(void* ptr) { rfree(ptr); }
    };

    static void con_h(void*), fin_h(void*);

    class rcons {
        int id;
    public:
        virtual void cons() = 0;
        virtual void final() {}
        rcons() { id = -1; }
        ~rcons() { disable(); }
        void enable() { if (id == -1) id = newcons(con_h, this); }
        void disable() { if (id != -1) { delcons(id); id = -1; } }
        // "::" selects the library function rather than the member of the same name
        void arm_final() { if (id != -1) ::arm_final(id, fin_h); }
        void unarm_final() { if (id != -1) ::arm_final(id, NULL); }
    };

    void con_h(void* p) { ((rcons*)p)->cons(); }
    void fin_h(void* p) { ((rcons*)p)->final(); }

Figure 7. C++ wrapping of DC primitives.



Creating and Deleting Constraints. For the sake of simplicity, in Section 2 constraints have been modeled as ordinary commands. DC takes a more flexible approach: constraints are specified as closures formed by a function that carries out the computation and a user-defined parameter to be passed to the function. Different constraints may therefore share the same function code, but have different user-defined parameters. New constraint instances can be created by calling newcons, which takes as parameters a pointer cons to a function and a user-defined parameter param. When invoked in non-atomic normal execution mode, newcons immediately executes function cons with parameter param, and logs all dependencies between the created constraint and the reactive locations read during the execution. If a constraint is created inside an atomic block (or inside another constraint), its first evaluation is deferred until the end of the execution of the current block (or constraint). All subsequent re-executions of the constraint triggered by modifications of the reactive cells it depends on will be performed with the same value of param specified at creation time. newcons returns a unique id for the created constraint, which can be passed to delcons to dispose of it.
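The closure idea (one function, several parameters) can be illustrated without the DC library. The sketch below uses hypothetical helpers `mock_newcons`, `mock_delcons`, and `mock_rerun` that only store the function/parameter pair and call it; they do not track reactive memory, so re-execution is triggered by hand rather than by the runtime.

```cpp
#include <cassert>
#include <vector>

typedef void (*cons_t)(void*);

// Hypothetical registry standing in for the runtime's constraint table.
struct Closure { cons_t fn; void* param; bool alive; };
static std::vector<Closure> registry;

// Stores the closure, runs it once (as in normal execution mode), returns its id.
int mock_newcons(cons_t fn, void* param) {
    registry.push_back({fn, param, true});
    fn(param);
    return (int)registry.size() - 1;
}
void mock_delcons(int id) { registry[id].alive = false; }

// Re-executes a live constraint with the parameter given at creation time.
void mock_rerun(int id) {
    if (registry[id].alive) registry[id].fn(registry[id].param);
}

// Two constraints can share this function and differ only in their parameter.
struct Scale { int factor; const int* in; int* out; };
void scale_cons(void* p) {
    Scale* s = static_cast<Scale*>(p);
    *s->out = s->factor * *s->in;
}
```

Creating two `Scale` parameters with different factors yields two independent constraints backed by the same `scale_cons` code, and disposing of one leaves the other untouched.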

Reading and Modifying Objects. Reading and modifying 
objects in reactive memory can be done in DC by evaluating 
ordinary C/C++ expressions. We remark that no syntax ex- 
tensions or explicit macro/function invocations are required. 

Customizing the Scheduler. Differently from other ap- 
proaches p4], DC allows programmers to customize the 
execution order of scheduled constraints. While the default 
pick function of DC (which gives higher priority to least re- 
cently executed constraints) works just fine in practice for a 
large class of problems (see Section |2), the ability to replace 
it can play an important role for some specific problems, as 
we have seen in the incremental shortest paths example of Section 3.3. DC provides a function set_comp that installs
a user-defined comparator to determine the relative priority 
of two constraints. The comparator receives as arguments the 
user-defined parameters associated with the constraints to be 
compared. 
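A comparator over user-defined parameters might look like the sketch below. The return-value convention is an assumption made for this example only (the text does not specify it): here `comp(a, b) > 0` means that `a` should run before `b`. `EdgeParam`, `larger_variation_first`, and `execution_order` are hypothetical names, and the sort merely shows the order a scheduler using such a comparator would follow.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

typedef int (*comp_t)(void*, void*);

// Hypothetical user parameter attached to a constraint.
struct EdgeParam { int id; double variation; };

// Assumed convention: a positive result means the first argument runs first.
int larger_variation_first(void* a, void* b) {
    double va = static_cast<EdgeParam*>(a)->variation;
    double vb = static_cast<EdgeParam*>(b)->variation;
    return (va > vb) - (va < vb);
}

// Orders scheduled constraints (identified by their parameters) using comp.
std::vector<void*> execution_order(std::vector<void*> params, comp_t comp) {
    std::stable_sort(params.begin(), params.end(),
                     [comp](void* a, void* b) { return comp(a, b) > 0; });
    return params;
}
```

A real scheduler would more likely keep a priority queue keyed by the comparator; the sort is used here only to keep the example short.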

Final Handlers. An additional feature of DC, built on top of the core constraint handling mechanisms described in Section 2, is the ability to perform some finalization operations only when the results of constraint evaluations are stable, i.e., when the solver has found a common fixpoint. For instance, a constraint computing the attribute of a widget in a graphic user interface may also update the screen by calling drawing primitives of the GUI toolkit: if a redrawing occurs at each constraint execution, this may cause unnecessary screen updates and flickering effects. Another usage example of this feature will be given in Section 5.3.

DC allows users to specify portions of code for a constraint to be executed as final actions just before resuming the underlying imperative program interrupted by the solver activation. This can be done by calling function arm_final during constraint solving: the operation schedules a final handler to be executed at the end of the current solving session. The function takes as parameters a constraint id and a pointer to a final handler, or NULL to cancel a previous request. A final handler receives the same parameter as the constraint it is associated with, but no dependencies on reactive locations are logged during its execution. All final handlers are executed in normal execution mode as a whole inside an atomic block.
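The "run once per solving session" behaviour of final handlers can be sketched with a toy session object. Everything here is hypothetical (`MiniSession`, its `arm_final`, and `solve` are not the DC API), and handlers are modelled as `std::function` objects rather than function pointers.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <map>

struct MiniSession {
    std::map<int, std::function<void()>> cons;    // constraint bodies by id
    std::map<int, std::function<void()>> armed;   // handlers armed in this session
    std::deque<int> sched;                        // scheduled constraint ids

    // Arms (or, with an empty handler, cancels) the final handler of a constraint.
    void arm_final(int id, std::function<void()> h) {
        if (h) armed[id] = h;
        else armed.erase(id);
    }
    // Runs scheduled constraints, then each armed final handler exactly once.
    void solve() {
        while (!sched.empty()) {
            int id = sched.front();
            sched.pop_front();
            cons[id]();
        }
        std::map<int, std::function<void()>> finals;
        finals.swap(armed);                       // handlers do not carry over
        for (auto& kv : finals) kv.second();
    }
};
```

If a widget constraint is executed three times in one session and arms a redraw handler each time, the redraw still happens once, after the values have stabilized.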

C++ Wrapping of DC Primitives. The examples in the remainder of this paper are based on a simple C++ wrapping of the DC primitives, shown in Figure 7. We abstract the concepts of reactive object and constraint using two classes: robject and rcons. The former is a base class for objects stored in reactive memory. This is achieved by overloading the new and delete operators in terms of the corresponding DC primitives rmalloc and rfree, so that all member variables of the object are reactive. Class rcons is an abstract base class for objects representing dataflow constraints. The class provides a pure virtual function called cons, to be defined in subclasses, which provides the user code for a constraint. An additional empty final function can be optionally overridden in subclasses to define the finalization code for a constraint. The class also provides functions enable and disable to activate/deactivate the constraint associated with the object, and functions arm_final and unarm_final to schedule/unschedule the execution of final handlers.

5. Applications and Programming Examples 

In this section, we discuss how DC can improve C/C++ 
programmability in three relevant application scenarios. To 
the best of our knowledge, these applications have not been 
explored before in the context of dataflow programming. All 
the code we show is real. 
    template<typename T> struct node : robject, rcons {
        enum op_t { SUM, PROD };
        T val;
        op_t op;
        node *left, *right;
        node(T v) : val(v), left(NULL), right(NULL) { enable(); }
        node(op_t o) : op(o), left(NULL), right(NULL) { enable(); }
        void cons() {
            if (left == NULL || right == NULL) return;
            switch (op) {
                case SUM:  val = left->val + right->val; break;
                case PROD: val = left->val * right->val; break;
            }
        }
    };

Figure 8. Incremental evaluation of expression trees.

5.1 Incremental Computation

In many applications, the input data is subject to continuous 
updates that need to be processed efficiently. For instance, in 
a networking scenario, routers must react quickly to link fail- 
ures by updating routing tables in order to minimize commu- 
nication delays. When the input is subject to small changes, 
a program may fix incrementally only the portion of the out- 
put affected by the update, without having to recompute the 
entire solution from scratch. For many problems, efficient ad hoc algorithms are known that can update the output asymptotically faster than recomputing from scratch, delivering in practice speedups of several orders of magnitude. Such dynamic algorithms, however, are typically difficult to design and implement, even for problems that are easy to solve from scratch. A language-centric approach, which
was extensively explored in both functional and imperative 
programming languages, consists of automatically turning a 
conventional static algorithm into an incremental one, by se- 
lectively recomputing the portions of a computation affected 
by an update of the input. This powerful technique, known 
as self-adjusting computation 101, provides a principled way 
of deriving efficient incremental code for several problems . 
We now show that dataflow constraints can provide an effec- 
tive alternative for specifying incremental programs. Later 
in this section we discuss differences and similarities of the 
two approaches. 

Example. To put our approach into the perspective of pre- 
vious work on self-adjusting computation, we revisit the 
problem of incremental re-evaluation of binary expression 
trees discussed in 0251]. This problem is a special case of 
the circuit evaluation described in Section 3.2: input values are stored at the leaves and the value of each internal node is determined by applying a binary operator (e.g., sum or product) on the values of its children. The final result of the evaluation is stored at the root. We start from the conventional node structure that a programmer would use for a binary expression tree, containing the type of the operation computed at the node (only relevant for internal nodes), the node's value, and the pointers to the subtrees. Our DC-based solution (see Figure 8) simply extends the node declaration by letting it inherit from classes robject and rcons, and by providing the code of a constraint that computes the value of the node in terms of the values stored at its children. Everything else is exactly what the programmer would have done anyway to build the input data structure. An expression tree can be constructed by just creating nodes and connecting them in the usual way:

    node<int> *root = new node<int>(node<int>::SUM);
    root->left = new node<int>(10);
    root->right = new node<int>(node<int>::PROD);
    root->right->left = new node<int>(2);
    root->right->right = new node<int>(6);

The example above creates the tree shown in Figure 9 (left). Since all fields of the node are reactive and each node is equipped with a constraint that computes its value, at any time during the tree construction, root->val contains the correct result of the expression evaluation. We remark that this value not only is given for free without the need to compute it explicitly by traversing the tree, but is also updated automatically after any change of the tree. For instance, changing the value of the rightmost leaf with root->right->right->val = 3 triggers the propagation chain shown in Figure 9 (right). Other possible updates that would be automatically propagated include changing the operation type of a node or even adding/removing entire subtrees. Notice that a single change to a node may trigger the re-execution of the constraints attached to all its ancestors, so the total worst-case time per update is $O(h)$, where $h$ is the height of the tree. For a balanced expression tree, this is exponentially faster than recomputing from scratch. If a batch of changes is to be performed and only the final value of the tree is of interest, performance can be improved by grouping updates with begin_at() and end_at() so that the re-execution of constraints is deferred until the end of the batch, e.g.:

    begin_at();                     // put the solver to sleep
    root->op = node<int>::SUM;      // change node operation type
    delete root->right->left;       // delete leaf
    // etc...
    end_at();                       // wake up the solver

Discussion. DC and imperative self-adjusting computation languages such as CEAL [4] share the basic idea of change propagation, and reactive memory is very similar to CEAL's modifiables. However, the two approaches differ in a number of important aspects. In CEAL, the solution is initially computed by a core component and later updated by a mutator, which performs changes to the input. In DC there is no explicit distinction between an initial run and a sequence of updates, and in particular there is no static algorithm that is automatically dynamized. Instead, programmers explicitly break down the solution to a complex problem into a collection of reactive code fragments that locally update small portions of the program state as a function of other portions. This implies a paradigm shift that may be less straightforward for the average programmer than writing static algorithms, but it can make it easier to exploit specific properties of the problem at hand, which in some cases can be crucial for coding algorithms provably faster than recomputing from scratch.

Figure 9. Reactive expression tree (left) and change propagation chain after a leaf value update (right).

Traceable data types ['T] have been recently introduced to 
extend the class of static algorithms that can be handled ef- 
ficiently by self-adjusting computation: in a traceable data 
type, dependencies are tracked at the level of data structure 
operations, rather than individual memory locations, com- 
bining in a hybrid approach the benefits of automatic change 
propagation and those of ad hoc dynamic data structures. 
This can yield large asymptotic gains in dynamizing static 
algorithms that use basic abstract data types, such as dictio- 
naries or priority queues. However, it is not clear that every 
conventional static algorithm can be effectively dynamized 
in this way: for some complex problems, it may be neces- 
sary to implement an ad hoc traceable data structure that per- 
forms all the incremental updates, thus missing the advan- 
tages of automatic change propagation and thwarting the au- 
tomatic incrementalization nature of self-adjusting computa- 
tion. In contrast, dataflow constraints can explicitly deal with 
changes to the state of the program (when this is necessary to 
obtain asymptotic benefits), incorporating change-awareness 
directly within the code controlled by the change propaga- 
tion algorithm without requiring traceable data structures. 

5.2 Implementing the Observer Design Pattern

As a second example, we show how the reactive nature of 
our framework can be naturally exploited to implement the 
observer software design pattern. A common issue arising 
from partitioning a system into a collection of cooperating 
software modules is the need to maintain consistency be- 
tween related objects. In general, a tight coupling of the 
involved software components is not desirable, as it would 
reduce their reusability. For example, graphical user inter- 
face toolkits almost invariably separate presentational as- 
pects from the underlying application data management, al- 
lowing data processing and data presentation modules to 
be reused independently. The observer software design pat- 



tern fl'V\ answers the above concerns by defining one-to- 
many dependencies between objects so that when one ob- 
ject (the subject) changes state, all its dependents (the ob- 
servers) are automatically notified. A key aspect is that sub- 
jects send out notifications of their change of state, without 
having to know who their observers are, while any number 
of observers can be subscribed to receive these notifications 
(subjects and observers are therefore not tightly coupled). A 
widely deployed embodiment of this pattern is provided by 
the Qt application development framework fl^. 

Qt is based on a signal-slot communication mechanism: 
a signal is emitted when a particular event occurs, whereas a 
slot is a function that is called in response to a particular sig- 
nal. An object acting as a subject emits signals in response 
to changes of its state by explicitly calling a special mem- 
ber function designated as a signal. Observers and subjects 
can be explicitly connected so that any signal emitted by a 
subject triggers the invocation of one or more observer slots. 
Programmers can connect as many signals as they want to a 
single slot, and a signal can be connected to as many slots 
as they need. Since the connection is set up externally af- 
ter creating the objects, this approach allows objects to be 
unaware of the existence of each other, enhancing informa- 
tion encapsulation and reuse of software components. Sub- 
jects and observers can be created in Qt as instances of the 
QObject base class. Qt's signal-slot infrastructure hinges upon an extension of the C++ language with three new keywords: signals and slots, to designate functions as signals or slots, and emit, to generate signals.

A Minimal Example: Qt vs. DC. To illustrate the concepts discussed above and compare Qt and DC as tools for implementing the observer pattern, we consider a minimal example excerpted from the Qt 4.6 reference documentation. The goal is to set up a program in which two counter variables a and b are connected together so that the value of b is automatically kept consistent with the value of a. The example starts with the simple declaration shown in Figure 10(a) (all except the framed box), which encapsulates the counter into an object with member functions value/setValue for accessing/modifying it. Figure 10(b) shows how the Counter class can be modified in Qt so that counter modifications can be automatically propagated to other objects as prescribed by the observer pattern. First of all, the class inherits from Qt's QObject base class and starts with the Q_OBJECT macro. Function setValue is declared as a slot and it is augmented by explicitly calling the valueChanged signal with the emit keyword every time an actual change occurs. Since Qt Counter objects contain both signal and slot functions, they can act both as subjects and as observers. The following code snippet shows how two counters can be created and connected so that each change to the former triggers a change of the latter:

Counter *a = new Counter, *b = new Counter;
QObject::connect(a, SIGNAL(valueChanged(int)),
                 b, SLOT(setValue(int)));

a->setValue(12); // a->value() == 12, b->value() == 12
b->setValue(48); // a->value() == 12, b->value() == 48



The QObject::connect call installs a connection between
counters a and b: every time emit valueChanged(value)
is issued by a with a given actual parameter, setValue(int
value) is automatically invoked on b with the same parameter.
Therefore, the call a->setValue(12) has as a side effect that
the value of b is also set to 12. Conversely, the call
b->setValue(48) entails no change of a, as no connection
exists from b to a.
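
The connection semantics described above can be mimicked in standard
C++ without Qt. The following is only a sketch under stated
assumptions, not Qt's implementation: the signal is modeled as a list
of std::function callbacks, and the static connect helper is a
hypothetical stand-in for QObject::connect.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Plain C++ stand-in for the Qt Counter. The valueChanged signal is
// modeled as a list of callbacks (an assumption made for illustration).
class Counter {
public:
    int value() const { return m_value; }

    void setValue(int value) {
        if (value != m_value) {          // propagate only actual changes
            m_value = value;
            for (auto &slot : m_valueChanged) slot(value);   // "emit"
        }
    }

    // Hypothetical helper playing the role of QObject::connect:
    // every change of subject is forwarded to observer->setValue.
    static void connect(Counter *subject, Counter *observer) {
        assert(subject != nullptr && observer != nullptr);
        subject->m_valueChanged.push_back(
            [observer](int v) { observer->setValue(v); });
    }

private:
    int m_value = 0;
    std::vector<std::function<void(int)>> m_valueChanged;
};
```

With a one-way connection from a to b, setting a also sets b, while
setting b leaves a untouched. The explicit check in setValue is what
stops propagation when two counters are connected in a cycle.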

The same result can be achieved in DC by just letting
the Counter class of Figure 10(a) inherit from the robject
base class shown in an earlier figure. As a result, the m_value
member variable is stored in reactive memory. The prescribed
connection between reactive counters can be enforced with a
one-way dataflow constraint that simply assigns the value of b
equal to the value of a:

Counter *a = new Counter, *b = new Counter;

struct C : rcons {
    Counter *a, *b;
    C(Counter *a, Counter *b) : a(a), b(b) { enable(); }
    void cons() { b->setValue(a->value()); }
} c(a,b);

a->setValue(12); // a->value() == 12, b->value() == 12
b->setValue(48); // a->value() == 12, b->value() == 48

We notice that the role of the QObject::connect of the
Qt implementation is now played by a dataflow constraint, 
yielding exactly the same program behavior. 

class Counter : public robject {   // ": public robject" is the framed box
public:
    Counter() { m_value = 0; }
    int value() const { return m_value; }
    void setValue(int value) { m_value = value; }
private:
    int m_value;
};

(a) A counter class and its DC observer pattern version (framed box).

class Counter : public QObject {
    Q_OBJECT
public:
    Counter() { m_value = 0; }
    int value() const { return m_value; }
public slots:
    void setValue(int value);
signals:
    void valueChanged(int newValue);
private:
    int m_value;
};

void Counter::setValue(int value) {
    if (value != m_value) {
        m_value = value;
        emit valueChanged(value);
    }
}

(b) Qt observer pattern version of the counter class.

Figure 10. Observer pattern example excerpted from the Qt
4.6 reference documentation: DC vs. Qt implementation.

Discussion. The example above shows that DC's runtime
system handles automatically a number of aspects that
would have to be set up explicitly by the programmers using
Qt's mechanism:

• there is no need to define slots and signals, relieving
programmers from the burden of extending the definition
of subject and observer classes with extra machinery (see
Figure 10);

• only actual changes of an object's state trigger propagation
events, so programmers do not have to make explicit
checks such as in Counter::setValue's definition to
prevent infinite looping in the case of cyclic connections
(see Figure 10(b));

• DC does not require extensions of the language, and thus
the code does not have to be preprocessed before being
compiled.

We sketch below further points that make dataflow con- 
straints a flexible framework for supporting some aspects 
of component programming, putting it into the perspective 
of mainstream embodiments of the observer pattern such as 
Qt: 

• in DC, only subjects need to be reactive, while observers
can be of any C++ class, even of third-party libraries
distributed in binary code form. In Qt, third-party observers
must be wrapped using classes equipped with slots that
act as stubs;

• relations between Qt objects are specified by creating
explicitly one-to-one signal-slot connections one at a time;
a single DC constraint can enforce simultaneously any
arbitrary set of many-to-many relations. Furthermore,
as the input variables of a dataflow constraint are
detected automatically, relations may change dynamically
depending on the state of some objects;

• Qt signal-slot connections let subjects communicate values
to their observers; DC constraints can compute the
values received by the observers as an arbitrary function
of the state of multiple subjects, encapsulating complex
update semantics;

• in Qt, an object's state change notification can be deferred
by emitting a signal only after a series of state
changes has been made, thereby avoiding needless intermediate
updates; DC programmers can control the granularity
of change propagations by temporarily disabling
constraints and/or by using begin_at/end_at primitives.

    template<class T, class N> class snode : public rcons {
 1      map<N**, snode<T,N>*> *m;
 2      N *head, **tail;
 3      snode *next;
 4      int refc;
 5  public:
 6      snode(N *h, N **t, map<N**, snode<T,N>*> *m) :
 7          m(m), head(h), tail(t), next(NULL), refc(0) {
 8          (*m)[tail] = this;
 9          enable();
10      }
11      ~snode() {
12          m->erase(tail);
13          if (next != NULL && --next->refc == 0) delete next;
14      }
15      void cons() {
16          snode<T,N>* cur_next;
17
18          if (*tail != NULL) {
19              typename map<N**, snode<T,N>*>::iterator it =
20                  m->find( &(*tail)->next );
21              if (it != m->end())
22                  cur_next = it->second;
23              else cur_next = new snode<T,N>(*tail,
24                                  &(*tail)->next, m);
25          } else cur_next = NULL;
26
27          if (next != cur_next) {
28              if (next != NULL && --next->refc == 0)
29                  next->arm_final();
30              if (cur_next != NULL && cur_next->refc++ == 0)
31                  cur_next->unarm_final();
32              next = cur_next;
33          }
34          if (head != NULL) T::watch(head);
35      }
36      void final() { delete this; }
37  };
38  template<class T, class N> class watcher {
39      snode<T,N> *gen;
40      map<N**, snode<T,N>*> m;
41  public:
42      watcher(N** h) { gen = new snode<T,N>(NULL, h, &m); }
43      ~watcher() { delete gen; }
44  };

Figure 11. Data structure checking and repair: list watcher.

5.3 Data Structure Checking and Repair 

Long-living applications inevitably experience various forms
of damage, often due to bugs in the program, which could
lead to system crashes or wrong computational results.
The ability of a program to perform automatic consistency
checks and self-healing operations can greatly improve
reliability in software development. One of the most common
causes of faults is connected with different kinds of data
structure corruptions, which can be mitigated using data
structure repair techniques [21].

In this section, we show how dataflow constraints can be 
used to check and repair reactive data structures. We ex- 
emplify this concept by considering the simple problem of 
repairing a corrupt doubly-linked list [30]. We first show
how to build a generic list watcher, which is able to de- 
tect any changes to a list and perform actions when modi- 
fications occur. This provides an advanced example of DC 
programming, where constraints are created and destroyed 
by other constraints. Differently from the expression trees 
of Section 5.1, where constraints are attributes of nodes, the
main challenge here is how to let the watched list be com- 
pletely unaware of the watcher, while still maintaining auto- 
matically a constraint for each node. The complete code of 
the watcher is shown in Figure 11. The only assumption our
watcher makes on list nodes to be monitored (of generic type 
N) is that they are reactive and contain a next field pointing to 
the successor. The main idea is to maintain a shadow list of 
constraints that mirrors the watched list (Figure 12). Shadow
nodes are snode objects containing pointers to the monitored 
nodes (head) and to their next fields (tail). A special gener- 
ator shadow node (gen) is associated to the reactive variable 
(list) holding the pointer to the first node of the input list. 
A lookup table (m) maintains a mapping from list nodes to 
the corresponding shadow nodes. The heart of the watcher is 
the constraint associated with shadow nodes (lines 15-35). 
It first checks if the successor of the monitored node, if any, 
is already mapped to a shadow node (lines 18-21). If not, it 
creates a new shadow node (line 23). Lines 27-33 handle the 
case where the successor of the shadow node has changed 
and its next field has to be updated. Line 34 calls a user- 
defined watch function (provided by template parameter T), 
which performs any desired checks and repairs for an input 
list node. To dispose of shadow nodes when the correspond- 
ing nodes are disconnected from the list, we use a simple 
reference counting technique, deferring to a final handler the 
task of deallocating dead shadow nodes (line 36). 

The following code snippet shows how to create a simple
repairer for a doubly-linked list based on the watcher of
Figure 11:

struct node : robject { int val; node *next, *prev; };

struct myrepairer {
    static void watch(node* x) {
        // check
        if (x->next != NULL && x != x->next->prev)
            // repair
            x->next->prev = x;
    }
};

// create reactive list head and repairer
node** list = ...;
watcher<myrepairer,node> rep(list);

// manipulate the list

Figure 13. DC's software architecture.

The repairer object rep checks if the invariant property x 
== x->next->prev is satisfied for all nodes in the list, and 
recovers it to a consistent state if any violation is detected 
during the execution of the program. We notice that several 
different watchers may be created to monitor the same list. 
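
To make the invariant concrete without the DC runtime, the check and
repair step can be applied eagerly over a whole list. The sketch below
is an illustration only: node is a plain, non-reactive struct, and
repair_list is a hypothetical helper that does not appear in the
paper. In DC, the watcher would instead re-run watch only for nodes
whose next field actually changed.

```cpp
#include <cassert>
#include <cstddef>

// Plain (non-reactive) stand-in for the node type used in the text.
struct node { int val; node *next, *prev; };

// Same check and repair as myrepairer::watch, for a single node.
// Returns true if a repair was performed.
inline bool watch(node *x) {
    assert(x != NULL);
    if (x->next != NULL && x != x->next->prev) {
        x->next->prev = x;               // repair the back pointer
        return true;
    }
    return false;
}

// Hypothetical helper: walks the list once and returns the number of
// back pointers that had to be repaired.
inline int repair_list(node *head) {
    int repaired = 0;
    for (node *x = head; x != NULL; x = x->next)
        if (watch(x)) ++repaired;
    return repaired;
}
```

A traversal of this kind costs time linear in the list length on every
check, which is the cost the reactive watcher is designed to avoid.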

6. Implementation 

In this section we discuss how DC can be implemented via 
a combination of runtime libraries, hardware/operating sys- 
tem support, and dynamic code patching, without requiring 
any source code preprocessing. The overall architecture of 
our DC implementation, which was developed on a Linux 
IA-32 platform, is shown in Figure 13. At a very high level,
the DC runtime library is stratified into two modules: 1) a
reactive memory manager, which defines the rmalloc and
rfree primitives and provides support for tracing accesses
to reactive memory locations; 2) a constraint solver, which 
schedules and dispatches the execution of constraints, keep- 
ing track of dependencies between reactive memory loca- 
tions and constraints. We start our description by discussing 
how to support reactive memory, which is the backbone of 
the whole architecture. 

6.1 Reactive Memory 

Taking inspiration from transactional memories [1], we
implemented reactive memory using off-the-shelf memory
protection hardware. Our key technique uses access violations
(AV) combined with dynamic binary code patching as a
basic mechanism to trace read/write operations to reactive
memory locations.

Access Violations and Dynamic Code Patching. Reactive 
memory is kept in a protected region of the address space so 
that any read/write access to a reactive object raises an AV. 
Since access violation handling is very inefficient, we use it 
just to detect incrementally instructions that access reactive 
memory. When an instruction x first tries to access a reac- 
tive location, a segmentation fault with offending instruction 
x is raised. In the SIGSEGV handler, we patch the trace t
containing x by overwriting its initial 5 bytes with a jump to 
a dynamically recompiled trace t' derived from t, which is 
placed in a code cache. In trace t' , x is instrumented with ad- 
ditional inline code that accesses reactive locations without 
generating AVs, and possibly activates the constraint solver. 
Trace t' ends with a jump that leads back to the end of t so 
that control flow can continue normally in the original code. 
Since t' may contain several memory access instructions, it 
is re-generated every time a new instruction that accesses 
reactive memory is discovered. To identify traces in the 
code, we analyze statically the binary code when it is loaded 
and we construct a lookup table that maps the 
address of each memory access instruction to 
the trace containing it. To handle the cases 
where a trace in a function f is smaller than 5
bytes and thus cannot be patched, we overwrite
the beginning of f with a jump to a new version
f' of f where traces are padded with trailing
nop instructions so that the smallest trace is at
least 5 bytes long.
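
The 5-byte requirement matches the size of the IA-32 near jump with a
32-bit relative displacement (opcode 0xE9 followed by a little-endian
rel32). The text does not state which jump encoding the runtime emits,
so the sketch below assumes this one; encode_jmp_rel32 is a
hypothetical helper that only computes the bytes and does not patch
any code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Encodes the 5-byte IA-32 instruction "jmp rel32" that, placed at
// address `from`, transfers control to address `to`. The displacement
// is relative to the end of the jump, i.e. to from + 5.
inline std::array<std::uint8_t, 5> encode_jmp_rel32(std::uint32_t from,
                                                    std::uint32_t to) {
    const std::uint32_t rel = to - (from + 5);   // wraps modulo 2^32
    return {{0xE9,
             static_cast<std::uint8_t>(rel & 0xFF),
             static_cast<std::uint8_t>((rel >> 8) & 0xFF),
             static_cast<std::uint8_t>((rel >> 16) & 0xFF),
             static_cast<std::uint8_t>((rel >> 24) & 0xFF)}};
}
```

A trace shorter than 5 bytes cannot hold this instruction without
overwriting the code that follows it, which is why such traces are
padded with nop instructions in the new version of the function.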

Shadow Memory and Address Redirecting. 

To avoid expensive un-protect and re-protect 
page operations at each access to reactive mem- 
ory, we mirror reactive memory pages with un- 
protected shadow pages that contain the actual 
data. The shadow memory region is kept un- 
der the control of our reactive memory alloca- 
tor, which maps it onto physical frames with 
the mmap system call. Any access to a reactive 
object is transparently redirected to the corre- 
sponding object in the shadow memory. As a 
result, memory locations at addresses within 
the reactive memory region are never actually 
read or written by the program. To avoid wast- 
ing memory without actually accessing it, re- 
active memory can be placed within the Kernel 
space, located in the upper 1GB of the address 
space on 32-bit Linux machines with the clas- 
sical 3/1 virtual address split. Kernel space is flagged in the 
page tables as exclusive to privileged code (ring 2 or lower), 
thus an AV is triggered if a user-mode instruction tries to 
touch it. More recent 64-bit platforms offer even more flexibility
to accommodate reactive memory in protected regions
of the address space. We let the reactive memory region
start at address 2^31 + 2^30 = 0xC0000000 and grow upward
as more space is needed (see the figure on the right). The
shadow memory region starts at address 2^31 = 0x80000000
and grows upward, eventually hitting the memory mapping
segment used by Linux to keep dynamic libraries, anonymous
mappings, etc. Any reactive object at address x is mirrored
by a shadow object at address x - S, where S = 2^30 =
0x40000000 is a fixed offset. This makes address redirecting
very efficient.
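
Address redirecting then amounts to a single subtraction. The sketch
below assumes the 32-bit layout just described (reactive region
starting at 2^31 + 2^30, shadow region starting at 2^31, fixed offset
S = 2^30); the constant and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Layout constants for a 32-bit address space with a 3/1 split.
const std::uint32_t kReactiveBase = 0xC0000000u;  // 2^31 + 2^30
const std::uint32_t kShadowBase   = 0x80000000u;  // 2^31
const std::uint32_t kOffsetS      = 0x40000000u;  // S = 2^30

// True if address x falls in the (protected) reactive region.
inline bool is_reactive(std::uint32_t x) { return x >= kReactiveBase; }

// Maps a reactive address to the shadow address that holds the data.
inline std::uint32_t to_shadow(std::uint32_t x) {
    assert(is_reactive(x));
    return x - kOffsetS;
}
```

Because S is a compile-time constant, the instrumented code needs no
table lookup to find the shadow copy of a reactive object.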

6.2 Constraint Solver 

Our implementation aggregates reactive locations in 4-byte 
words aligned at 32 bit boundaries. The solver is activated 
every time such a word is read in constraint execution mode, 
or its value is modified by a write operation. The main 
involved units are (see Figure 13):

1. A dispatcher that executes constraints, maintaining a 
global timestamp that grows by one at each constraint 
execution. For each constraint, we keep the timestamp of 
its latest execution. 

2. A memory access logger that maintains the set of depen- 
dencies D and a list W of all reactive memory words 
written by the execution of the current constraint c_self,
along with their initial values before the execution. To 
avoid logging information about the same word multi- 
ple times during the execution of a constraint, the logger 
stamps each word with the time of the latest constraint 
execution that accessed it. Information is logged only if 
the accessed word has a timestamp older than the current 
global timestamp, which can only happen once for any 
constraint execution. To represent D, the logger keeps for 
each word v the address of the head node of a linked list 
containing the id's of constraints depending upon v. 

3. A constraint scheduler that maintains the set of scheduled
constraints S. By default S is a priority queue, where
the priority of a constraint is given by the timestamp of 
its latest execution: the scheduler repeatedly picks and 
lets the dispatcher execute the constraint with the highest 
priority, until S gets empty. Upon completion of a con- 
straint's execution, words are scanned and removed from 
W: for each v ∈ W whose value has changed since the
beginning of the execution, the constraint id's in the list 
of nodes associated with v are added to S, if not already 
there. 

Nodes of the linked lists that represent D and data struc- 
tures S and W are kept in contiguous chunks allocated with 
malloc. To support direct lookup, timestamps and depen- 
dency list heads for reactive memory words are stored in a 
contiguous reactive memory info region that starts at address 
2^31 = 0x80000000 and grows downward, eventually hitting
the heap's brk. 



A critical aspect is how to clean up old dependencies in 
D when a constraint is re-evaluated. To solve the problem 
efficiently in constant amortized time per list operation, we 
keep for each node its insertion time into the linked list. 
We say that a node is stale if its timestamp is older than 
the timestamp of the constraint it refers to, and up to date 
otherwise. Our solver uses a lazy approach and disposes of 
stale nodes only when the word they refer to is modified and 
the linked list is traversed to add constraints to S. To prevent 
the number of stale nodes from growing too large, we use an 
incremental garbage collection technique. 
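
The timestamp bookkeeping described in this section can be modeled in
a few lines. The following is a simplified sketch, not the actual
solver: it keeps dependency lists in std::vector rather than in linked
lists stored in a dedicated memory region, it ignores the write log W
and the garbage collector, and all names (Solver, DepNode, log_read,
on_change) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One node of a word's dependency list: constraint id + insertion time.
struct DepNode { int cons; unsigned time; };

struct Solver {
    unsigned now = 0;                         // global timestamp
    std::vector<unsigned> last_exec;          // per constraint: latest execution
    std::vector<std::vector<DepNode>> deps;   // per word: dependency list
    std::vector<unsigned> read_stamp;         // per word: time of last logged read

    Solver(std::size_t n_cons, std::size_t n_words)
        : last_exec(n_cons, 0), deps(n_words), read_stamp(n_words, 0) {}

    // Called when the dispatcher starts (re-)executing constraint c.
    void begin_execution(int c) { last_exec[c] = ++now; }

    // Called when the running constraint c reads word w: the dependency
    // is logged at most once per constraint execution.
    void log_read(int c, std::size_t w) {
        if (read_stamp[w] < now) {
            read_stamp[w] = now;
            deps[w].push_back({c, now});
        }
    }

    // Called when word w has changed: returns the constraints to be
    // scheduled and lazily disposes of stale nodes during the traversal.
    std::vector<int> on_change(std::size_t w) {
        std::vector<int> to_schedule;
        std::vector<DepNode> kept;
        for (const DepNode &d : deps[w]) {
            if (d.time < last_exec[d.cons]) continue;   // stale node
            kept.push_back(d);
            to_schedule.push_back(d.cons);
        }
        deps[w].swap(kept);
        return to_schedule;
    }
};
```

A node becomes stale as soon as its constraint is executed again,
since the constraint's timestamp then exceeds the node's insertion
time. No list has to be touched at that moment; the node is dropped
the next time the word changes.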

7. Experimental Evaluation 

In this section we present an experimental analysis of the
performance of DC in a variety of different settings, showing
that our implementation is effective in practice.

7.1 Benchmark Suite 

We have evaluated DC on a set of benchmarks that includes 
a variety of problems on lists, grids, trees, matrices, and 
graphs, as well as full and event-intensive interactive appli- 
cations. 

• Linked Lists. We considered several fundamental prim- 
itives on linear linked data structures, which provide a 
variety of data manipulation patterns. Our benchmarks 
include data structures for: computing the sum of the ele- 
ments in a list (adder), filtering the items of a list accord- 
ing to a given function (filter), randomly assigning 
each element of a list to one of two output lists (halver), 
mapping the items of a list onto new values according to 
a given mapping function (mapper), merging two sorted 
lists into a single sorted output list (merger), produc- 
ing a sorted version of an input list (msorter), produc- 
ing a reversed version of an input list (reverser); split- 
ting a list into two output lists, each containing only ele- 
ments smaller or, respectively, greater than a given pivot 
(splitter). All benchmarks are subject to operations 
that add or remove nodes from the input lists. 

• Graphs and Trees. Benchmarks in this class include clas- 
sical algorithmic problems for routing in networks and 
tree computations: 

■ sp: given a weighted directed graph and a source node
s, computes the distances of all graph nodes from s.
Graph edges are subject to edge weight decreases (see
Section 3.3).

■ exptrees: computes the value of an expression tree
subject to operations that change leaf values or operators
computed by internal nodes (see Section 5.1).

• Linear Algebra. We considered number-crunching prob- 
lems on vectors and matrices, including the product of a 
vector and a matrix (vecmat) and matrix multiplication 
(matmat), subject to different kinds of updates of single 
cells as well as of entire rows or columns. 








            From-scratch time (secs)                      | Propagation time (msecs) | Mem peak usage (Mbytes)  |  DC statistics
Benchmark   conv    dc  ceal  dc/conv  ceal/conv  ceal/dc |     dc    ceal   ceal/dc |      dc    ceal  ceal/dc |  avg cons/update  instr time  patched instr
adder       0.10  1.44  1.40    14.40      14.00     0.97 |   0.68   85.80    126.17 |  211.54  232.87     1.10 |              1.5       0.030             26
exptrees    0.14  1.02  1.07     7.28       7.64     1.04 |   4.11    5.46      1.32 |  143.30  225.32     1.57 |             15.6       0.028             72
filter      0.19  2.08  1.11    10.94       5.84     0.53 |   0.63    2.49      3.95 |  265.78  189.47     0.71 |              0.5       0.032             39
halver      0.20  2.08  1.33    10.40       6.65     0.63 |   0.61    3.95      6.47 |  269.10  218.22     0.81 |              0.5       0.030             38
mapper      0.19  2.04  1.30    10.73       6.84     0.63 |   0.61    2.63      4.31 |  261.53  214.34     0.81 |              0.5       0.032             39
merger      0.19  2.12  1.37    11.15       7.21     0.64 |   0.66    4.43      6.71 |  284.41  218.21     0.81 |              0.5       0.031             57
msorter     0.91  5.18  3.91     5.69       4.29     0.75 |   5.55   15.91      2.86 |  689.59  820.14     1.18 |             37.6       0.031             75
reverser    0.18  2.04  1.30    11.33       7.22     0.63 |   0.62    2.63      4.24 |  267.45  214.34     0.80 |              0.5       0.030             37
splitter    0.18  2.27  1.31    12.61       7.27     0.57 |   1.54    3.92      2.54 |  344.60  222.34     0.64 |              1.5       0.031             56

Table 1. Performance evaluation of DC versus CEAL, for a common set of benchmarks. Input size is n = 1,000,000 for all
tests except msorter, for which n = 100,000.



[Figure 14 panels: (a) Update times, mapper benchmark, versus percentage of input changed; (b) Total update times, adder benchmark, versus number of nodes (x 100000); (c) Performance ratio CEAL/DC, adder benchmark, versus number of nodes (x 100000).]

Figure 14. (a) Change propagation times on the mapper benchmark for complex updates with input size n = 100,000; (b-c)
performance comparison of the change propagation times of DC and CEAL on the adder benchmark.



• Interactive Applications. We considered both full real ap- 
plications and synthetic worst-case scenarios, including: 

■ Othello: full application that implements the well- 
known board game in which two players in turn place 
colored pieces on a square board, with the goal of re- 
versing as many of their opponent's pieces as possi- 
ble; 

■ buttongrid: event-intensive graphic user interface 
application with a window containing n x n push 
buttons embedded in a grid layout. This is an extreme 
artificial scenario in which many events are generated, 
since a quadratic number of buttons need to be resized 
and repositioned to maintain the prescribed layout at 
each interactive resize event. 

Some benchmarks, such as matmat and sp, are very computationally
demanding. For all these benchmarks we have
considered an implementation based on DC, obtained by
making the base data structures (e.g., the input list) reactive,
and a conventional implementation in C based on non-reactive
data structures. Interactive applications (othello
and buttongrid) are written in the Qt-4 framework: change
propagation throughout the GUI is implemented either using
constraints (DC versions), or using the standard signal-slot
mechanism provided by Qt (conventional versions). To
assess the performance of DC against competitors that
can quickly respond to input changes, we have also considered
highly tuned ad hoc dynamic algorithms [36]
and incremental solutions realized in CEAL [23], a
state-of-the-art C-based language for self-adjusting computation.
Benchmarks in common with CEAL are adder, exptrees,
filter, halver, mapper, merger, msorter, reverser,
and splitter. For these benchmarks, we have used the
optimized implementations provided by Hammer et al. [25].

7.2 Performance Metrics and Experimental Setup 

We tested our benchmarks both on synthetic and on real test 
sets, considering a variety of performance metrics: 

• Running times: we measured the time required to initialize
the data structures with the input data (from-scratch
execution), the time required by change propagation, and
binary code instrumentation time. All reported times are
wall-clock times, averaged over three independent trials.
Times were measured with gettimeofday(), turning
off any other processes running in the background.

• Memory usage: we computed the memory peak usage as
well as a detailed breakdown to assess which components
of our implementation take up most memory (constraints,
shadow memory, reactive memory, stale and non-stale
dependencies, etc.).

• DC-related statistics: we collected detailed profiling
information including counts of patched instructions, stale
dependencies cleanups, allocated/deallocated reactive
blocks, created/deleted constraints, constraints executed
per update, and distinct constraints executed per update.

Figure 15. Analysis of different pick function definitions
on the incremental routing problem.

All DC programs considered in this section, except for sp,
which will be discussed separately, use the default
timestamp-based comparator for constraint scheduling.
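
The timing methodology described above (wall-clock time read with
gettimeofday() and averaged over three trials) can be sketched as
follows. This is not the authors' measurement harness: the helper
names wall_clock_secs and average_time are hypothetical, and the code
assumes a POSIX system that provides <sys/time.h>.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/time.h>

// Wall-clock time in seconds, read with gettimeofday().
inline double wall_clock_secs() {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

// Hypothetical helper: runs f `trials` times and returns the average
// wall-clock time of one run, in seconds.
template <class F>
double average_time(F f, int trials = 3) {
    assert(trials > 0);
    double total = 0.0;
    for (int i = 0; i < trials; ++i) {
        const double start = wall_clock_secs();
        f();
        total += wall_clock_secs() - start;
    }
    return total / trials;
}
```

Wall-clock measurements include time spent in other processes, which
is why the experiments were run with background processes turned off.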

Experimental Platform. The experiments were performed 
on a PC equipped with a 2.10 GHz Intel Core 2 Duo with 3 
GB of RAM, running Linux Mandriva 2010.1 with Qt 4.6. 
All programs were compiled with gcc 4.4.3 and optimization
flag -O3.

7.3 Incremental Computation 

As observed in Section 3.3, the reactive nature of our mixed
imperative/dataflow framework makes it a natural ground 
for incremental computation. In this section, we present ex- 
perimental evidence that a constraint-based solution in our 
framework can respond to input updates very efficiently. We 
first show that the propagation times are comparable to state 
of the art automatic change propagation frameworks, such as 
CEAL [25], and for some problems can be orders of magni- 
tude faster than recomputing from scratch. We then consider 
a routing problem on real road networks, and compare our 
DC-based solution both to a conventional implementation 
and to a highly optimized ad hoc dynamic algorithm sup- 
porting specific update operations. 

Comparison to CEAL. Table 1 summarizes the outcomes
of our experimental comparison with the conventional version
and with CEAL for all common benchmarks. Input
size is n = 1,000,000 for all tests (with the exception of
msorter, for which n = 100,000), where n is the length
of the input list for the list-based benchmarks, and the number
of nodes in the (balanced) input tree for exptrees.
Table 1 reports from-scratch execution times of both DC and
CEAL (compared to the corresponding conventional implementations),
average propagation times in response to small
changes of the input, memory usage, and some DC statistics
(average number of executed constraints per update, executable
instrumentation time, and total number of patched instructions).
The experiments show that our DC implementation
performs remarkably well. From-scratch times are on average
a factor of 1.4 higher than those of CEAL, while propagation
times are smaller by a factor of 4 on average for
all tests considered except the adder, yielding large speedups
over complete recalculation. In the case of the adder
benchmark, DC leads by a huge margin in terms of propagation
time (see Figure 14(b) and Figure 14(c)), which can be
attributed to the different asymptotic performance of the
algorithms handling the change propagation (constant for DC,
and logarithmic in the input size for the list reduction
approach used by CEAL). We remark that the logarithmic
bound of self-adjusting computation could be reduced to
constant by using a traceable accumulator, as observed in
Section 5.1 (however, support for traceable data structures is
not yet integrated in CEAL).
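
To see why a sum can be maintained in constant time per update,
consider the following sketch. It is not the code of the adder
benchmark, which works on linked lists under node insertions and
deletions; it only illustrates the general idea that a change to one
item can be absorbed by adjusting the running total with the
difference between the new and the old value. The class name
IncrementalAdder is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Keeps the sum of a sequence up to date in O(1) time per item change,
// instead of reducing the whole input again.
class IncrementalAdder {
public:
    explicit IncrementalAdder(const std::vector<long> &items)
        : m_items(items), m_sum(0) {
        for (long v : m_items) m_sum += v;   // from-scratch execution, O(n)
    }

    // Change propagation for a single modified item: O(1).
    void update(std::size_t i, long new_value) {
        assert(i < m_items.size());
        m_sum += new_value - m_items[i];
        m_items[i] = new_value;
    }

    long sum() const { return m_sum; }

private:
    std::vector<long> m_items;
    long m_sum;
};
```

A reduction tree over the same input would instead have to recompute
one partial sum per level after each change, giving the logarithmic
bound mentioned above.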

We also investigated how DC and CEAL scale in the case
of batches of updates that change multiple input items
simultaneously. The results are reported in Figure 14(a) for the
representative mapper benchmark, showing that the selective
recalculations performed by DC and CEAL are faster
than recomputing from scratch for changes up to significant
percentages of the input.



Comparison to ad hoc Incremental Shortest Paths. We
now consider an application of the shortest path algorithm
discussed in Section 3.3 to incremental routing in road networks.
We assess the empirical performance of a constraint-based
solution implemented in DC (sp) by comparing it
with Goldberg's smart queue implementation of Dijkstra's
algorithm (sq), a highly optimized C++ code used as the
reference benchmark in the 9th DIMACS Implementation
Challenge [18], and with an engineered version of the ad hoc
incremental algorithm by Ramalingam and Reps (rr) [36].
Our code supports update operations following the high-level
description given in Figure 5, except that we create one
constraint per node, rather than one constraint per edge. We
used as input data a suite of US road networks of size up to
1.5 million nodes and 3.8 million edges derived from the US
Census 2000 TIGER/Line Files [37]. Edge weights are large
and represent integer positive travel times. We performed on
each graph a sequence of m/10 random edge weight decreases,
obtained by picking edges uniformly at random and
reducing their weights by a factor of 2. Updates that did not
change any distances were not counted.
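
The kind of update used in this experiment can be illustrated with a
conventional, explicit algorithm. The sketch below is neither the
constraint-based sp code nor the rr implementation: it is a textbook
Dijkstra-style propagation restarted from the head of the edge whose
weight was halved, and all names (Graph, shortest_paths,
decrease_weight) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct Edge { int to; long w; };
typedef std::vector<std::vector<Edge>> Graph;   // adjacency lists
typedef std::pair<long, int> Item;              // (distance, node)
typedef std::priority_queue<Item, std::vector<Item>,
                            std::greater<Item>> MinQueue;
const long kInf = std::numeric_limits<long>::max();

// Relaxes edges starting from the nodes already in pq.
inline void propagate(const Graph &g, std::vector<long> &dist, MinQueue &pq) {
    while (!pq.empty()) {
        Item top = pq.top(); pq.pop();
        int u = top.second;
        if (top.first > dist[u]) continue;      // outdated queue entry
        for (const Edge &e : g[u])
            if (dist[u] + e.w < dist[e.to]) {
                dist[e.to] = dist[u] + e.w;
                pq.push(Item(dist[e.to], e.to));
            }
    }
}

// From-scratch computation of all distances from source s.
inline std::vector<long> shortest_paths(const Graph &g, int s) {
    std::vector<long> dist(g.size(), kInf);
    MinQueue pq;
    dist[s] = 0;
    pq.push(Item(0, s));
    propagate(g, dist, pq);
    return dist;
}

// Halves the weight of the k-th edge leaving u and repairs dist,
// visiting only nodes whose distance actually decreases.
inline void decrease_weight(Graph &g, std::vector<long> &dist,
                            int u, std::size_t k) {
    assert(k < g[u].size());
    Edge &e = g[u][k];
    e.w /= 2;
    if (dist[u] == kInf || dist[u] + e.w >= dist[e.to]) return;
    dist[e.to] = dist[u] + e.w;
    MinQueue pq;
    pq.push(Item(dist[e.to], e.to));
    propagate(g, dist, pq);
}
```

When a decrease does not shorten any path, decrease_weight returns
immediately; such updates correspond to the ones that were not counted
in the experiment.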
Table 2. Performance evaluation of DC for incremental routing in US road networks using up to 1.5 million constraints.

Figure 16. Comparison with signal-slot mechanism in Qt: (a) buttongrid; (b) othello.



The results of our experiments are shown in Table 2 and
Figure 15. Both sp and rr were initialized with distances
computed using sq, hence we report from-scratch time only 
for this algorithm. Due to the nature of the problem, the 
average number of node distances affected by an update is 
rather small and almost independent of the size of the graph. 
Analogously to the incremental algorithm of Ramalingam 
and Reps, the automatic change propagation strategy used 
by our solver takes full advantage of this strong locality, re- 
evaluating only affected constraints and delivering substan- 
tial speedups over static solutions in typical scenarios. Our 
DC-based implementation yields propagation times that are, 
on average, a factor of 1.85 higher than the conventional ad 
hoc incremental algorithm, but it is less complex, requires 
fewer lines of code, is fully composable, and is able to re- 
spond seamlessly to multiple data changes, relieving the pro- 
grammer from the task of implementing explicitly change 
propagation. We also tested sp with different types of sched- 
ulers. By customizing the pick function of the default prior- 
ity queue scheduler (giving highest priority to nodes closest 
to the source), a noticeable performance improvement has 
been achieved (see Figure 15). We also tried a simple stack
scheduler, which, however, incurred a slowdown of a factor 
of 4 over the default scheduler. 

7.4 Comparison to Qt's Signal-slot Mechanism 

Maintaining relations between widgets in a graphic user in- 
terface is one of the most classical applications of dataflow 
constraints fi^. We assess the performance of DC in event- 



intensive interactive applications by comparing the DC im- 
plementations of buttongrid and othello with the con- 
ventional versions built atop Qt's signal-slot mechanism. 

In buttongrid, each constraint computes the size and 
position of a button in terms of the size and position of 
adjacent buttons. We considered user interaction sessions 
with continuous resizing, which induce intensive schedul- 
ing activity along several propagation chains in the acyclic 
dataflow graph. In othello, constraints are attached to cells 
of the game board (stored in reactive memory) and main- 
tain a mapping between the board and its graphical repre- 
sentation: in this way, the game logic can be completely un- 
aware of the GUI backend, as prescribed by the observer 
pattern (see Section 5.2). For both benchmarks, we experimented
with different grid/board sizes. Figure 16 plots the
average time per resize event (buttongrid) and per game 
move (othello), measured over 3 independent runs. Both 
the total time and the change propagation time are reported. 
For all values of n, the performance differences between the DC
and conventional Qt implementations are negligible, and the
curves almost overlap. Furthermore, the time spent
in change propagation is only a small fraction of the total
time, showing that the overhead introduced by access violation
handling, instrumentation, and scheduling in DC can
be largely amortized over the general cost of widget management
and event propagation in Qt and in its underlying
layers.

Figure 17. Analysis of vecmat as a function of the block size: (a) memory usage; (b) update time for two different kinds of
updates. Point labels indicate the number of constraints executed in each batch (for single-cell updates, the number of executed
constraints is always 1, but a constraint does more work as b increases).

Figure 18. Time required for dynamic instrumentation.

7.5 Fine-grained vs. Coarse-grained Decompositions 

A relevant feature of DC is that designers can flexibly de- 
cide at which level of granularity a given algorithmic solu- 
tion can be decomposed into smaller parts, i.e., they might 
use a single constraint that performs the entire computation 
(coarse-grained decomposition), or many constraints each 
computing only a small portion of the program's state (fine- 
grained decomposition). In reactive scenarios, where con- 
straints are re-evaluated selectively only on the affected por- 
tions of the input, this design choice can have implications 
both on memory usage and on running time. To explore these 
tradeoffs, we experimented with matrix benchmarks matmat 
and vecmat. For brevity, in this section we focus on vecmat;
the results for matmat are similar.

Let V be a vector of size n and let M be a reactive matrix
of size n × n. Our implementation of the vector-matrix product
algorithm is blocked: constraints are associated with blocks
of matrix cells, where a block is a set of consecutive cells
in the same column. If the block size is 1, then there is one
constraint per matrix cell: constraint C_ij is responsible for
updating the j-th entry of the output vector with the product
V[i] * M[i][j]. This can be done in O(1) time by maintaining
a local copy of the old product value and updating the result
with the difference between the new value and the old one.
If the block size is n, then there is one constraint per matrix
column: constraint C_j, associated with column j, computes
the scalar product between V and the j-th column of M and
updates the j-th entry of the output vector with the new value.
The approach can be naturally adapted to deal with any block
size b ∈ [1, n].
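
The blocked scheme can be sketched in ordinary C++ as follows. This is only an illustration under stated assumptions: the class `BlockedVecMat` and its methods are hypothetical, reactive memory is emulated by an explicit `set` call, and each constraint is modeled as a cached partial product for one block of one column. The counter `cells_read` makes the extra work of coarse blocks visible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class BlockedVecMat {
public:
    using Matrix = std::vector<std::vector<double>>;

    // Assumes m is n x n, v has size n, and 1 <= b <= n.
    BlockedVecMat(std::vector<double> v, Matrix m, std::size_t b)
        : v_(std::move(v)), m_(std::move(m)), b_(b), n_(v_.size()) {
        assert(b_ >= 1 && b_ <= n_);
        blocks_per_col_ = (n_ + b_ - 1) / b_;
        cached_.assign(blocks_per_col_ * n_, 0.0);
        out_.assign(n_, 0.0);
        // Initial evaluation of every constraint.
        for (std::size_t block = 0; block < blocks_per_col_; ++block)
            for (std::size_t j = 0; j < n_; ++j) reevaluate(block, j);
    }

    // Emulates a write to reactive cell M[i][j]: only the constraint that
    // owns the enclosing block is re-executed.
    void set(std::size_t i, std::size_t j, double value) {
        m_[i][j] = value;
        reevaluate(i / b_, j);
    }

    const std::vector<double>& result() const { return out_; }
    std::size_t constraint_count() const { return cached_.size(); }
    std::size_t cells_read() const { return cells_read_; }

private:
    // Recomputes the partial product of one block and patches the output
    // entry with the difference between the new and the old contribution.
    void reevaluate(std::size_t block, std::size_t j) {
        double partial = 0.0;
        const std::size_t end = std::min(n_, (block + 1) * b_);
        for (std::size_t i = block * b_; i < end; ++i) {
            partial += v_[i] * m_[i][j];
            ++cells_read_;
        }
        double& old = cached_[block * n_ + j];
        out_[j] += partial - old;
        old = partial;
    }

    std::vector<double> v_;
    Matrix m_;
    std::size_t b_;
    std::size_t n_;
    std::size_t blocks_per_col_ = 0;
    std::vector<double> cached_;  // one cached contribution per constraint
    std::vector<double> out_;
    std::size_t cells_read_ = 0;
};
```

With b = 1 a single-cell update reads one matrix cell, while with b = n it reads a whole column; conversely, the number of cached contributions (one per constraint) shrinks from n² to n.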

In Figure 17 we report the outcome of an experiment with
n = 2000 in which we increased the block size b from 1 to n.
As shown in Figure 17a, the memory usage is inversely proportional
to b, and thus directly proportional to the number
of constraints: the memory used for maintaining constraints 
is about half of the total amount when b = 1, and negligible 
when b = n. All the other components (in particular, reactive 
memory, shadow memory, and dependencies) do not depend 
on the specific block size and remain constant. Figure 17b
shows the effect of b on the change propagation times for
two different kinds of updates. For single-cell updates, the
time scales linearly with b (axes are on a log-log scale): this 
confirms the intuition that if an update changes only a sin- 
gle cell, implementations using larger block sizes perform a 
lot of unnecessary work. The scenario is completely differ- 
ent if single updates need to change entire columns (which is 
a typical operation for instance in incremental graph reacha- 
bility algorithms jlm ): in this case, change propagation time 
is not only not penalized by larger block sizes but is even
slightly improved. This is because larger values
of b yield fewer constraints, which reduces
scheduling activity. The improvement, however, is
small, suggesting that DC's constraint scheduling overhead
is modest compared to the overall work required to
solve a given problem, even at the finest-grained decomposition,
where there is one constraint per matrix cell.

7.6 Instrumentation Overhead 

As a final set of experiments, we have measured how in- 
strumentation time scales as a function of the executable 
file size. We noticed that the performance overheads are 
dominated by the initial static binary code analysis phase 
performed during DC's initialization, which scans the code
to index memory access instructions as described in Section 6.1.
The times required for access violation handling
and just-in-time binary code patching are negligible compared
to the overall execution times in all tested applications
and are not reported. The experiment was conducted by initializing
DC on executable files obtained by statically linking
object files of increasing total size. The results are reported
in Figure 18 and indicate that DC scales linearly, with
total instrumentation times being reasonably small even for
large executable sizes.

8. Related Work 

The ability of a program to respond to modifications of its 
environment is a feature that has been widely explored in a 
large variety of settings and along rather different research 
lines. While this section is far from being exhaustive, we 
discuss some previous works that appear to be more closely 
related to ours. 

GUI and Animation Toolkits. Although dataflow programming
is a general paradigm, dataflow constraints
gained popularity in the 1990s, especially in the creation of
interactive user interfaces. Amulet [34] and its predecessor
Garnet [33] are graphic user interface toolkits based on
the dataflow paradigm. Amulet integrates a constraint solver 
with a prototype-instance object model implemented on top 
of C++, and is closely related to our work. Each object, 
created by making an instance of a prototype object, con- 
sists of a set of properties (e.g., appearance or position) that 
are stored in reactive variables, called slots. Constraints are 
created by assigning formulas to slots. Values of slots are ac- 
cessed through a Get method that, when invoked from inside 
of a formula, sets up a dependency between slots. A variety 
of approaches have been tested by the developers to solve 
constraints [38]. FRAN (Functional Reactive Animation) 
provides a reactive environment for composing multimedia
animations through temporal modeling [39]: graphical
objects in FRAN use time-varying, reactive variables to automatically
change their properties, achieving an animation
that is a function of both events and time.
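
The slot-and-formula mechanism described above can be sketched as follows. This is not Amulet's actual API: the class `Slot`, its `get` and `set` methods, and the global `g_evaluating` are hypothetical names, and propagation is done eagerly on acyclic dependencies rather than by a real constraint solver. The point is only to show how a read performed inside a formula can register a dependency.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <utility>

class Slot;
// Slot whose formula is currently being evaluated (nullptr outside formulas).
static Slot* g_evaluating = nullptr;

class Slot {
public:
    // A plain slot holding a value.
    explicit Slot(double value) : value_(value) {}
    // A slot constrained by a formula; the formula runs once immediately.
    explicit Slot(std::function<double()> formula)
        : formula_(std::move(formula)) {
        recompute();
    }
    Slot(const Slot&) = delete;
    Slot& operator=(const Slot&) = delete;

    // Reads the slot; when called from inside a formula, also records that
    // the slot owning that formula depends on this one.
    double get() {
        if (g_evaluating != nullptr && g_evaluating != this)
            dependents_.insert(g_evaluating);
        return value_;
    }

    // Writes a plain slot and re-runs every formula that has read it.
    void set(double value) {
        value_ = value;
        notify();
    }

private:
    void recompute() {
        Slot* outer = g_evaluating;
        g_evaluating = this;
        value_ = formula_();
        g_evaluating = outer;
        notify();
    }

    void notify() {
        // Iterate over a copy: formulas may register dependencies again.
        const std::set<Slot*> snapshot = dependents_;
        for (Slot* s : snapshot) s->recompute();
    }

    double value_ = 0.0;
    std::function<double()> formula_;
    std::set<Slot*> dependents_;
};
```

For instance, a slot `right` defined by the formula `left.get() + width.get()` is recomputed whenever `left` or `width` is set. Because propagation here is eager, a formula with several changed inputs may run more than once before settling, which is one reason practical solvers schedule re-evaluations instead.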

Reactive Languages. The dataflow model of computation
can also be supported directly by programming languages.
Most of them are visual languages, often used in industrial
settings, and allow the programmer to directly manage
the dataflow graph by visually putting links between
the various entities. Only a few non-visual languages provide
a dataflow environment, mostly for specific domains.
Among them, Signal [24] and Lustre [13] are dedicated to
programming real-time systems found in embedded software,
and SystemC is a system-level specification and
design language based on C++. The data-driven Alpha language
provided by the Leonardo software visualization system
allows programmers to specify declarative mappings
between the state of a C program and a graphical representation
of its data structures [16]. Recently, Meyerovich et
al. [32] have introduced Flapjax, a reactive extension to the
JavaScript language targeted to Web applications. Flapjax
offers behaviors (i.e., variables whose value changes are automatically
propagated by the language) and event streams
(i.e., potentially infinite streams of discrete events, each of
which triggers additional computations). SugarCubes [12]
and ReactiveML [31] allow reactive programming (in Java
and OCaml, respectively) by relying not on operating system
and runtime support, as our approach does, but rather on
causality analysis and a custom interpreter/compiler. These
systems, however, track dependencies between functional
units through the use of specific language constructs, such
as events, and explicit commands for generating and waiting
for events.

Constraint Satisfaction. Dataflow constraints fit within 
the more general field of constraint programming [7]. Terms
such as "constraint propagation" and "constraint solving"
have often been used in papers related to dataflow since the
early developments of the area [11, 34, 38]. However, the
techniques developed so far in dataflow programming are
quite distant from those appearing in the constraint programming
literature [9]. In constraint programming, relations between
variables can be stated in the form of multi-way constraints,
typically specified over restricted domains such as
real numbers, integers, or Booleans. Domain-specific solvers
use knowledge of the domain in order to explicitly forbid
values or combinations of values for some variables [9],
while dataflow constraint solvers are domain-independent.
Moving from early work on attribute grammars [inl |29||, 
a variety of incremental algorithms for performing effi- 
cient dataflow constraint satisfaction have been proposed 
in the literature and integrated in dataflow systems such as 
Amulet. These algorithms are based either on a mark-sweep 
approach iflTi IZTIl . or on a topological ordering [6, ,26il . 
In contrast, DC uses a priority-based approach, which al- 
lows users to customize the constraint scheduling order. 
Mark-sweep algorithms are preferable when the dataflow 
graph can change dynamically during constraint evaluation:
this may happen if constraints use indirection and conditionals,
and thus cannot be statically analyzed. With both
approaches, cyclic dependencies between constraints
are broken arbitrarily, taking care to evaluate
each constraint in a cycle at most once. Compared to
our iterative approach, this limits the expressive power of
constraints.

Self-adjusting Computation. A final related area, which we
have extensively discussed throughout the paper, is that of
self-adjusting computation, in which programs respond to
input changes by automatically updating their output. This is
achieved by recording data and control dependencies during
the execution of programs so that a change propagation
algorithm can update the computation as if the program
were run from scratch, but executing only those parts of the
computation affected by changes. We refer to [3, 4, 25] for
recent progress in this field.

9. Future Work 

The work presented in this paper paves the way for several
further developments. Although conventional platforms offer
limited support for implementing reactive memory efficiently,
we believe that our approach can greatly benefit from
advances in the active field of transactional memory, which
shares with our work the same fundamental need for fine-grained,
highly efficient control over memory accesses. Multi-core
platforms suggest another interesting direction. Indeed, exposing
parallelism was one of the motivations for dataflow
architectures since the early developments of the area. We
regard it as a challenging goal to design effective models and
efficient implementations of one-way dataflow constraints in
multi-core environments.